#457 - Abu-Rasheed 2024
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations

Abu-Rasheed, H.; Weber, C.; Fathi, M.

15th IEEE Global Engineering Education Conference (IEEE EDUCON) 2024;():

Greece Ieee 2024

DOI: 10.1109/educon60312.2024.10578654 · Ref ID: 3157

In the era of personalized education, providing comprehensible explanations for learning recommendations is of great value for enhancing the learner's understanding of, and engagement with, the recommended learning content. Large language models (LLMs) and generative AI have recently opened new doors for generating human-like explanations for and alongside learning recommendations. However, their precision is still far from acceptable in a sensitive field like education. To harness the abilities of LLMs while still ensuring a high level of precision with respect to the learners' intent, this paper proposes an approach that utilizes knowledge graphs (KGs) as a source of factual context for LLM prompts, reducing the risk of model hallucinations and safeguarding against wrong or imprecise information, while maintaining an application-intended learning context. We utilize the semantic relations in the knowledge graph to offer curated knowledge about learning recommendations. With domain experts in the loop, we design the explanation as a textual template, which is filled and completed by the LLM. Domain experts were integrated in the prompt engineering phase as part of a study, to ensure that explanations include information that is relevant to the learner. We evaluate our approach quantitatively using ROUGE-N and ROUGE-L measures, as well as qualitatively with experts and learners. Our results show enhanced recall and precision of the generated explanations compared to those generated solely by the GPT model, with a greatly reduced risk of generating imprecise information in the final learning explanation.
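The template-filling idea the abstract describes can be sketched in a few lines; the entities, relations, and template wording below are illustrative assumptions, not the paper's actual prompt design.

```python
# Sketch: serialize knowledge-graph facts into the prompt so the LLM
# completes a fixed textual template instead of generating freely.
# All names and the template text below are made-up examples.

def triples_to_context(triples):
    """Render (subject, relation, object) triples as plain-text facts."""
    return "\n".join(f"- {s} {r} {o}" for s, r, o in triples)

def build_explanation_prompt(learner_goal, recommendation, triples):
    """Assemble a KG-grounded prompt around an explanation template."""
    return (
        "Facts from the knowledge graph:\n"
        f"{triples_to_context(triples)}\n\n"
        "Using ONLY the facts above, complete this explanation:\n"
        f'"{recommendation} is recommended because it ___, '
        f'which supports your goal of {learner_goal}."'
    )

prompt = build_explanation_prompt(
    "data engineering",
    "Intro to SQL",
    [("Intro to SQL", "teaches", "relational queries"),
     ("relational queries", "is a prerequisite of", "data engineering")],
)
```

Because the model only fills the blank, anything outside the curated facts stays under template control, which is the hallucination-limiting effect the paper measures.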

#2469 - Abuosba 2015
Formalizing big data processing lifecycles: Acquisition, serialization, aggregation, analysis, mining, knowledge representation, and information dissemination

Abuosba, K.

2015 International Conference and Workshop on Computing and Communication (IEMCON) 2015;():1-4

2015

DOI: 10.1109/IEMCON.2015.7344533 · Ref ID: 6054

In today's e-Business environment, ERP, CRM, collaboration tools, and networked sensors can be characterized as data-generating resources. Business Intelligence (BI) is a term that incorporates a range of analytical and decision-support applications in business, including data mining, decision support systems, knowledge management systems, and online analytical processing; processing data within these systems produces new data that grow rapidly, causing data-management limitations if handled by a Relational Database Management System (RDBMS) or statistical tools. Collectively, these structured and unstructured data are referred to as Big Data. Successful and efficient handling of Big Data requires deploying specific IT infrastructure components as well as adopting an emerging service model. In this research we introduce a conceptual model that abstracts the processing scheme of the big data processing lifecycle. The model addresses the main phases of the lifecycle: data acquisition, data serialization, data aggregation, data analysis, data mining, knowledge representation, and information dissemination. The model is driven by projecting Service Oriented Architecture attributes onto the building blocks of the lifecycle and adhering to the Lifecycle Modeling Language specification.

#335 - Addad 2024
Homeopathic Poisoning of RAG Systems

Addad, B.; Kapusta, K.

43rd International Conference on Computer Safety, Reliability and Security (SAFECOMP) 2024;14989():358-364

Florence, ITALY Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-68738-9_28 · Ref ID: 3762

Despite their remarkable success and wide use in many applications, large language models (LLMs) are not free from intrinsic vulnerabilities (e.g., prompt injection). They may also suffer from hallucinations and drops in performance due to a lack of up-to-date knowledge. Retrieval-Augmented Generation (RAG) is currently one of the most promising techniques to mitigate such issues. In short, RAG augments each prompt with relevant context from an external knowledge database. Usually, the context is composed of the texts that are most similar to the request. While reducing hallucinations, RAG at the same time enlarges the attack surface of the whole system: an attacker may poison the knowledge database by injecting bad or misleading information. In this paper, we introduce HOPRAG, a subtle but very efficient poisoning technique that consists of adding a suffix (or prefix) of only a few tokens (sub-words) to any given text to raise (or decrease) its similarity with a prompt, and therefore be used (or avoid being used) as context by the RAG system when answering. Our results show that with only three injected tokens, we manage to perform a successful attack.
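The mechanism is easy to reproduce with a toy retriever: appending a few query-like tokens raises a passage's similarity score, so a similarity-based retriever prefers the poisoned passage. The bag-of-words cosine below is a stand-in for a real embedding model, and the texts are invented; this is not the paper's HOPRAG method.

```python
# Toy reproduction of the attack mechanism (not the paper's actual method):
# a few injected tokens raise a passage's similarity with the query.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

query = "capital of france"
honest = "Paris has been the seat of French government for centuries."
poisoned = "Lyon is the capital. capital of france"  # three injected tokens
# cosine(query, poisoned) now exceeds cosine(query, honest), so the
# poisoned passage wins the retrieval step.
```

A real attack targets a learned embedding space rather than token overlap, but the ranking-flip effect is the same.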

#625 - Afshar 2024
On the role of the UMLS in supporting diagnosis generation proposed by Large Language Models

Afshar, M.; Gao, Y. J.; Gupta, D.; Croxford, E.; Demner-Fushman, D.

J. Biomed. Inform. 2024;157():9

2024

DOI: 10.1016/j.jbi.2024.104707 · Ref ID: 3688

Objective: Traditional knowledge-based and machine learning diagnostic decision support systems have benefited from integrating the medical domain knowledge encoded in the Unified Medical Language System (UMLS). The emergence of Large Language Models (LLMs) to supplant traditional systems poses questions of the quality and extent of the medical knowledge in the models' internal knowledge representations and the need for external knowledge sources. The objective of this study is three-fold: to probe the diagnosis-related medical knowledge of popular LLMs, to examine the benefit of providing the UMLS knowledge to LLMs (grounding the diagnosis predictions), and to evaluate the correlations between human judgments and the UMLS-based metrics for generations by LLMs. Methods: We evaluated diagnoses generated by LLMs from consumer health questions and daily care notes in the electronic health records using the ConsumerQA and Problem Summarization datasets. Probing LLMs for the UMLS knowledge was performed by prompting the LLM to complete the diagnosis-related UMLS knowledge paths. Grounding the predictions was examined in an approach that integrated the UMLS graph paths and clinical notes in prompting the LLMs. The results were compared to prompting without the UMLS paths. The final experiments examined the alignment of different evaluation metrics, UMLS-based and non-UMLS, with human expert evaluation. Results: In probing the UMLS knowledge, GPT-3.5 significantly outperformed Llama2 and a simple baseline yielding an F1 score of 10.9% in completing one-hop UMLS paths for a given concept. Grounding diagnosis predictions with the UMLS paths improved the results for both models on both tasks, with the highest improvement (4%) in SapBERT score. There was a weak correlation between the widely used evaluation metrics (ROUGE and SapBERT) and human judgments. 
Conclusion: We found that while popular LLMs contain some medical knowledge in their internal representations, augmentation with UMLS knowledge provides performance gains for diagnosis generation. The UMLS needs to be tailored to the task to improve the LLMs' predictions. Finding evaluation metrics that align with human judgments better than the traditional ROUGE and BERT-based scores remains an open research question.
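For reference, the unigram-overlap metric the study found only weakly correlated with expert judgment reduces to a few lines. This simplified ROUGE-1 sketch omits the stemming and multi-reference handling of real ROUGE implementations; the diagnosis strings are invented examples.

```python
# Simplified ROUGE-1: unigram-overlap precision and recall between a
# generated diagnosis and a reference. Surface overlap like this can
# diverge from expert judgment of clinical correctness.
from collections import Counter

def rouge1(candidate, reference):
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    precision = overlap / sum(c.values()) if c else 0.0
    recall = overlap / sum(r.values()) if r else 0.0
    return precision, recall

# "acute viral bronchitis" vs reference "viral bronchitis":
# overlap 2, precision 2/3, recall 2/2
p, r = rouge1("acute viral bronchitis", "viral bronchitis")
```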

#429 - Agarwal 2021
Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training

Agarwal, O.; Ge, H. M.; Shakeri, S.; Al-Rfou, R.

Conference of the North-American-Chapter of the Association-for-Computational-Linguistics - Human Language Technologies (NAACL-HLT) 2021;():3554-3565

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 2979

Prior work on Data-to-Text Generation, the task of converting knowledge graph (KG) triples into natural text, focused on domain-specific benchmark datasets. In this paper, however, we verbalize the entire English Wikidata KG, and discuss the unique challenges associated with broad, open-domain, large-scale verbalization. We further show that verbalizing a comprehensive, encyclopedic KG like Wikidata can be used to integrate structured KGs and natural language corpora. In contrast to the many architectures that have been developed to integrate these two sources, our approach converts the KG into natural text, allowing it to be seamlessly integrated into existing language models. It carries the further advantages of improved factual accuracy and reduced toxicity in the resulting language model. We evaluate this approach by augmenting the retrieval corpus in a retrieval language model and showing significant improvements on the knowledge-intensive tasks of open-domain QA and the LAMA knowledge probe.
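The verbalization step can be illustrated with a toy template renderer: Wikidata-style triples become plain sentences that can join a text corpus. The relation templates are hand-written assumptions; the paper itself fine-tunes a seq2seq model rather than relying on fixed templates.

```python
# Toy triple verbalizer: one hand-written template per relation.
# Relation names and templates are invented examples, not the paper's data.
RELATION_TEMPLATES = {
    "place of birth": "{s} was born in {o}.",
    "occupation": "{s} worked as a {o}.",
    "award received": "{s} received the {o}.",
}

def verbalize(triples):
    """Turn (subject, relation, object) triples into one sentence per fact."""
    return " ".join(
        RELATION_TEMPLATES[r].format(s=s, o=o)
        for s, r, o in triples
        if r in RELATION_TEMPLATES
    )

text = verbalize([
    ("Marie Curie", "place of birth", "Warsaw"),
    ("Marie Curie", "award received", "Nobel Prize in Physics"),
])
```

The resulting sentences can be appended to a retrieval corpus or pre-training data exactly like ordinary text, which is the integration advantage the abstract highlights.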

#107 - Agrawal 2023
CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

Agrawal, A.; Arora, R.; Datta, A.; Banerjee, S.; Bhowmick, B.; Jatavallabhula, K. M.; Sridharan, M.; Krishna, M.

32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 2023;():2604-2609

Busan, SOUTH KOREA Ieee 2023

DOI: 10.1109/ro-man57019.2023.10309325 · Ref ID: 3496

This paper introduces a novel method for determining the best room in which to place an object, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement-learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a) encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories in comparison with state-of-the-art baselines.

#1296 - Ahmed 2023
Explainable Integration of Knowledge Graphs Using Large Language Models

Ahmed, A. F.; Firmansyah, A. F.; Sherif, M. A.; Moussallem, D.; Ngonga Ngomo, A. C.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13913 LNCS():124-139

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-35320-8_9 · Ref ID: 5252

Linked knowledge graphs form the backbone of many data-driven applications such as search engines, conversational agents, and e-commerce solutions. Declarative link discovery frameworks use complex link specifications to express the conditions under which a link between two resources can be deemed to exist. However, understanding such complex link specifications is a challenging task for non-expert users of link discovery frameworks. In this paper, we address this drawback by devising NMV-LS, a language-model-based verbalization approach for translating complex link specifications into natural language. NMV-LS relies on the results of rule-based link specification verbalization to apply continuous training on T5, a large language model based on the Transformer architecture. We evaluated NMV-LS on English and German datasets using well-known machine translation metrics such as BLEU, METEOR, ChrF++, and TER. Our results suggest that our approach achieves a verbalization performance close to that of humans and outperforms state-of-the-art approaches. Our source code and datasets are publicly available at https://github.com/dice-group/NMV-LS. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

#3731 - Ahn 2016
A Neural Knowledge Language Model

Ahn, Sungjin; Choi, Heeyoul; Pärnamaa, Tanel; Bengio, Yoshua

arXiv 2016;():

2016

Ref ID: 7353

Current language models have a significant limitation in their ability to encode and decode factual knowledge. This is mainly because they acquire such knowledge from statistical co-occurrences, even though most knowledge-related words are rarely observed. In this paper, we propose a Neural Knowledge Language Model (NKLM), which combines symbolic knowledge provided by a knowledge graph with an RNN language model. By predicting whether the word to generate has an underlying fact or not, the model can generate knowledge-related words by copying from the description of the predicted fact. In experiments, we show that the NKLM significantly improves performance while generating a much smaller number of unknown words.

#3791 - Ahrabian 2023
PubGraph: A Large-Scale Scientific Knowledge Graph

Ahrabian, Kian; Du, Xinwei; Myloth, Richard Delwin; Ananthan, Arun Baalaaji Sankar; Pujara, Jay

arXiv 2023;():

2023

Ref ID: 7643

Research publications are the primary vehicle for sharing scientific progress in the form of new discoveries, methods, techniques, and insights. Unfortunately, the lack of a large-scale, comprehensive, and easy-to-use resource capturing the myriad relationships between publications, their authors, and venues presents a barrier to applications for gaining a deeper understanding of science. In this paper, we present PubGraph, a new resource for studying scientific progress that takes the form of a large-scale knowledge graph (KG) with more than 385M entities, 13B main edges, and 1.5B qualifier edges. PubGraph is comprehensive and unifies data from various sources, including Wikidata, OpenAlex, and Semantic Scholar, using the Wikidata ontology. Beyond the metadata available from these sources, PubGraph includes outputs from auxiliary community detection algorithms and large language models. To further support studies on reasoning over scientific networks, we create several large-scale benchmarks extracted from PubGraph for the core task of knowledge graph completion (KGC). These benchmarks present many challenges for knowledge graph embedding models, including an adversarial community-based KGC evaluation setting, zero-shot inductive learning, and large-scale learning. All of the aforementioned resources are accessible at https://pubgraph.isi.edu/ and released under the CC-BY-SA license. We plan to update PubGraph quarterly to accommodate the release of new publications.

#1782 - Akbacak 2014
Rapidly building domain-specific entity-centric language models using semantic web knowledge sources

Akbacak, M.; Hakkani-Tür, D.; Tur, G.

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2014;():2872-2876

International Speech and Communication Association 2014

Ref ID: 5806

For domain-specific speech recognition tasks, it is best if the statistical language model component is trained on text data that is similar in content and style to the targeted domain for which the application is built. For state-of-the-art language modeling techniques that can be used in real time within speech recognition engines during first-pass decoding (e.g., N-gram models), the above constraints have to be fulfilled in the training data. However, collecting such data, even through crowdsourcing, is expensive and time-consuming, and may still not be representative of how a much larger user population would interact with the recognition system. In this paper, we address this problem by employing several semantic web sources that already contain the domain-specific knowledge, such as query click logs and knowledge graphs. We build statistical language models that meet the requirements listed above for domain-specific recognition tasks where natural language is used and the user queries are about named entities in a specific domain. As a case study, in the movies domain, where users' voice queries are movie-related, a language model trained with the above resources not only yields significant perplexity and word-error-rate improvements over a generic web language model, but also demonstrates an approach by which such language models can be rapidly developed for other domains. Copyright © 2014 ISCA.
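The premise that in-domain training text helps can be shown with a toy bigram model: a model built from domain-like text assigns a domain query a higher probability than one built from generic text. The two "corpora" below are invented miniature examples, and add-one smoothing stands in for the better smoothing (e.g. Kneser-Ney) real systems use.

```python
# Toy bigram language model with add-one smoothing, to illustrate why
# content-matched training data improves domain-query probability.
import math
from collections import Counter

def bigram_logprob(text, corpus):
    """Smoothed log-probability of text's bigrams under a tiny corpus."""
    tokens = corpus.lower().split()
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens)
    vocab = len(unigrams)
    words = text.lower().split()
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(words, words[1:])
    )

movie_corpus = "play the movie inception play the trailer for inception"
generic_corpus = "the weather is nice today the stock market rose"
query = "play the movie"
# The movie-domain corpus assigns the query a higher log-probability,
# i.e. lower perplexity, than the generic corpus.
```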

#1187 - AlHasanRony 2022
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation

Al Hasan Rony, M. R.; Usbeck, R.; Lehmann, J.

Findings of the Association for Computational Linguistics: NAACL 2022 - Findings 2022;():2557-2571

Association for Computational Linguistics (ACL) 2022

Ref ID: 5458

Task-oriented dialogue generation is challenging since the underlying knowledge is often dynamic and effectively incorporating knowledge into the learning process is hard. It is particularly challenging to generate both human-like and informative responses in this setting. Recent research primarily focused on various knowledge distillation methods, where the underlying relationship between the facts in a knowledge base is not effectively captured. In this paper, we go one step further and demonstrate how the structural information of a knowledge graph can improve the system's inference capabilities. Specifically, we propose DialoKG, a novel task-oriented dialogue system that effectively incorporates knowledge into a language model. Our proposed system views relational knowledge as a knowledge graph and introduces (1) a structure-aware knowledge embedding technique and (2) a knowledge graph-weighted attention masking strategy to help the system select relevant information during dialogue generation. An empirical evaluation demonstrates the effectiveness of DialoKG over state-of-the-art methods on several standard benchmark datasets. © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.

#2530 - Al-Sabahi 2018
A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS)

Al-Sabahi, K.; Zuping, Z.; Nadher, M.

IEEE Access 2018;6():24205-24212

2018

DOI: 10.1109/ACCESS.2018.2829199 · Ref ID: 6131

Recent advances in neural network architectures and training algorithms have shown the effectiveness of representation learning: neural-network-based models generate better representations than traditional ones and can automatically learn distributed representations for sentences and documents. To this end, we propose a novel model that addresses several issues not adequately handled by previous models, such as the memory problem and incorporating knowledge of document structure. Our model uses a hierarchical structured self-attention mechanism to create the sentence and document embeddings. This architecture mirrors the hierarchical structure of the document and in turn enables us to obtain better feature representations. The attention mechanism provides an extra source of information to guide summary extraction. The model treats summarization as a classification task in which it computes the probability of each sentence belonging to the summary, informed by several features such as information content, salience, novelty, and positional representation. The proposed model was evaluated on two well-known datasets, CNN/Daily Mail and DUC 2002. The experimental results show that our model outperforms the current extractive state of the art by a considerable margin.

#3936 - Alam 2023
Towards Semantically Enriched Embeddings for Knowledge Graph Completion

Alam, Mehwish; van Harmelen, Frank; Acosta, Maribel

arXiv 2023;():

2023

Ref ID: 7789

Embedding based Knowledge Graph (KG) Completion has gained much attention over the past few years. Most of the current algorithms consider a KG as a multidirectional labeled graph and lack the ability to capture the semantics underlying the schematic information. In a separate development, a vast amount of information has been captured within the Large Language Models (LLMs) which has revolutionized the field of Artificial Intelligence. KGs could benefit from these LLMs and vice versa. This vision paper discusses the existing algorithms for KG completion based on the variations for generating KG embeddings. It starts with discussing various KG completion algorithms such as transductive and inductive link prediction and entity type prediction algorithms. It then moves on to the algorithms utilizing type information within the KGs, LLMs, and finally to algorithms capturing the semantics represented in different description logic axioms. We conclude the paper with a critical reflection on the current state of work in the community and give recommendations for future directions.

#1116 - Alatrash 2024
ConceptGCN: Knowledge concept recommendation in MOOCs based on knowledge graph convolutional networks and SBERT

Alatrash, R.; Chatti, M. A.; Ul Ain, Q.; Fang, Y.; Joarder, S.; Siepmann, C.

Comput. Educ. 2024;6():

2024

DOI: 10.1016/j.caeai.2023.100193 · Ref ID: 4045

Massive Open Online Courses (MOOCs) have gained popularity in the technology-enhanced learning (TEL) domain. To enhance the learning experience in MOOCs, educational recommender systems (ERSs) can play a crucial role by suggesting courses or learning materials that align with students' knowledge states. Understanding a student's learning needs and predicting the knowledge concepts that the student might be interested in are therefore important for providing effective recommendations. Inspired by the superior ability of knowledge graphs (KGs) to model the heterogeneous data in MOOCs and of Graph Neural Networks (GNNs) to learn on graph-structured data, a few works focusing on GNN-based recommendation of knowledge concepts in MOOCs have emerged recently. However, existing approaches in this domain have limitations mainly related to complexity, semantics, and transparency. To address these limitations, in this paper we propose ConceptGCN, an end-to-end framework that combines KGs, Graph Convolutional Networks (GCNs), and pre-trained transformer language model encoders (SBERT) to provide personalized and transparent recommendations of knowledge concepts in the MOOC platform [Blinded tool]. We conducted extensive offline experiments and an online user study (N=31), demonstrating the benefits of the ConceptGCN-based recommendation approach in terms of several important user-centric aspects, including accuracy, novelty, diversity, usefulness, overall satisfaction, use intentions, and reading intention. In particular, our results indicate that, if SBERT is used for the initial embeddings of items in the KG, a self-connection operation and a semantic similarity-based score function in the aggregation operation of GCN are not necessarily needed. © 2023 The Author(s)

#2025 - Alberts 2021
VisualSem: A High-quality Knowledge Graph for Vision & Language

Alberts, H.; Huang, N.; Deshpande, Y. R.; Liu, Y.; Cho, K.; Vania, C.; Calixto, I.

MRL 2021 - 1st Workshop on Multilingual Representation Learning, Proceedings of the Conference 2021;():138-152

Association for Computational Linguistics (ACL) 2021

Ref ID: 5600

An exciting frontier in natural language understanding (NLU) and generation (NLG) calls for (vision-and-) language models that can efficiently access external structured knowledge repositories. However, many existing knowledge bases only cover limited domains, or suffer from noisy data, and most of all are typically hard to integrate into neural language pipelines. To fill this gap, we release VisualSem: a high-quality knowledge graph (KG) which includes nodes with multilingual glosses, multiple illustrative images, and visually relevant relations. We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG. This multi-modal retrieval model can be integrated into any (neural network) model pipeline. We encourage the research community to use VisualSem for data augmentation and/or as a source of grounding, among other possible uses. VisualSem as well as the multi-modal retrieval models are publicly available and can be downloaded in this URL: https://github.com/iacercalixto/visualsem. © 2021 Association for Computational Linguistics.

#1687 - Alfasi 2023
Next-Generation Security Entity Linkage: Harnessing the Power of Knowledge Graphs and Large Language Models

Alfasi, D.; Shapira, T.; Bremler-Barr, A.

Proceedings of the 16th ACM International Conference on Systems and Storage, SYSTOR 2023 2023;():150

Association for Computing Machinery, Inc 2023

DOI: 10.1145/3579370.3594759 · Ref ID: 4811

With the continuous increase in reported Common Vulnerabilities and Exposures (CVEs), security teams are overwhelmed by vast amounts of data, which are often analyzed manually, leading to a slow and inefficient process. To address cybersecurity threats effectively, it is essential to establish connections across multiple security entity databases, including CVEs, Common Weakness Enumeration (CWEs), and Common Attack Pattern Enumeration and Classification (CAPECs). In this study, we introduce a new approach that leverages the RotatE [4] knowledge graph embedding model, initialized with embeddings from the Ada language model developed by OpenAI [3]. Additionally, we extend this approach by initializing the embeddings for the relations. © 2023 Owner/Author(s).

#3281 - Allen 2023
Conceptual Engineering Using Large Language Models

Allen, Bradley P.

arXiv 2023;():

2023

Ref ID: 7975

We describe a method, based on Jennifer Nado's definition of classification procedures as targets of conceptual engineering, that implements such procedures using a large language model. We then apply this method using data from the Wikidata knowledge graph to evaluate concept definitions from two paradigmatic conceptual engineering projects: the International Astronomical Union's redefinition of PLANET and Haslanger's ameliorative analysis of WOMAN. We discuss implications of this work for the theory and practice of conceptual engineering. The code and data can be found on GitHub.

#3367 - Alper 2024
Emergent Visual-Semantic Hierarchies in Image-Text Representations

Alper, Morris; Averbuch-Elor, Hadar

arXiv 2024;():

2024

Ref ID: 8454

While recent vision-and-language models (VLMs) like CLIP are a powerful tool for analyzing text and images in a shared semantic space, they do not explicitly model the hierarchical nature of the set of texts which may describe an image. Conversely, existing multimodal hierarchical representation learning methods require costly training from scratch, failing to leverage the knowledge encoded by state-of-the-art multimodal foundation models. In this work, we study the knowledge of existing foundation models, finding that they exhibit emergent understanding of visual-semantic hierarchies despite not being directly trained for this purpose. We propose the Radial Embedding (RE) framework for probing and optimizing hierarchical understanding, and contribute the HierarCaps dataset, a benchmark facilitating the study of hierarchical knowledge in image–text representations, constructed automatically via large language models. Our results show that foundation VLMs exhibit zero-shot hierarchical understanding, surpassing the performance of prior models explicitly designed for this purpose. Furthermore, we show that foundation models may be better aligned to hierarchical reasoning via a text-only fine-tuning phase, while retaining pretraining knowledge.

#2540 - Alrimawi 2018
I've Seen This Before: Sharing Cyber-Physical Incident Knowledge

Alrimawi, F.; Pasquale, L.; Mehta, D.; Nuseibeh, B.

2018 IEEE/ACM 1st International Workshop on Security Awareness from Design to Deployment (SEAD) 2018;():33-40

2018

DOI: 10.1145/3194707.3194714 · Ref ID: 6433

An increasing number of security incidents in cyber-physical systems (CPSs) arise from the exploitation of cyber and physical components of such systems. Knowledge about how such incidents arose is rarely captured and used systematically to enhance security and support future incident investigations. In this paper, we propose an approach to represent and share incidents knowledge. Our approach captures incident patterns – common aspects of incidents occurring in different CPSs. Our approach then allows incident patterns to be instantiated for different systems to assess if and how such patterns can manifest again. To support our approach, we provide two meta-models that represent, respectively, incident patterns and the cyber-physical systems themselves. The incident meta-model captures the characteristics of incidents, such as assets and activities. The system meta-model captures cyber and physical components and their interactions, which may be exploited during an incident. We demonstrate the feasibility of our approach in the application domain of smart buildings, by tailoring the system meta-model to represent components and interactions in this domain.

#3793 - Alshammari 2024
PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation

Alshammari, Suad; Basalelah, Lama; Rukbah, Walaa Abu; Alsuhibani, Ali; Wijesinghe, Dayanjan S.

arXiv 2024;():

2024

Ref ID: 8290

The exponential growth of scientific literature has resulted in information overload, challenging researchers to effectively synthesize relevant publications. This paper explores the integration of traditional reference management software with advanced computational techniques, including Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). We introduce PyZoBot, an AI-driven platform developed in Python that combines Zotero's reference management with OpenAI's sophisticated LLMs. PyZoBot streamlines knowledge extraction and synthesis from extensive human-curated scientific literature databases. It demonstrates proficiency in handling complex natural language queries, integrating data from multiple sources, and meticulously presenting references to uphold research integrity and facilitate further exploration. By leveraging LLMs, RAG, and human expertise through a curated library, PyZoBot offers an effective way to manage information overload and keep pace with rapid scientific advancements. The development of such AI-enhanced tools promises significant improvements in research efficiency and effectiveness across disciplines.
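The retrieve-then-generate pattern such a platform applies to a reference library can be sketched as follows. Keyword overlap stands in for the embedding-based retrieval an actual RAG stack would use, and the keys and abstracts are invented examples, not PyZoBot's code or data.

```python
# Minimal RAG sketch over a toy reference library: rank stored abstracts
# against the question, keep the top-k, and cite them in the prompt.
def retrieve(question, library, k=2):
    """Return the k entries whose abstracts share the most words with the question."""
    q = set(question.lower().split())
    ranked = sorted(
        library,
        key=lambda d: -len(q & set(d["abstract"].lower().split())),
    )
    return ranked[:k]

def rag_prompt(question, docs):
    """Assemble a prompt asking the model to answer while citing retrieved keys."""
    sources = "\n".join(f"[{d['key']}] {d['abstract']}" for d in docs)
    return f"Sources:\n{sources}\n\nAnswer the question, citing [keys]: {question}"

library = [
    {"key": "A1", "abstract": "knowledge graphs reduce hallucination in language models"},
    {"key": "B2", "abstract": "protein folding predicted with deep learning"},
]
```

Keeping the citation keys in the prompt is what lets the final answer point back to specific library entries, the reference-integrity property the abstract emphasizes.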

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1652 - Alshomary 2024
Modeling the Quality of Dialogical Explanations

Alshomary, M.; Lange, F.; Booshehri, M.; Sengupta, M.; Cimiano, P.; Wachsmuth, H.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():11523-11536

European Language Resources Association (ELRA) 2024

Ref ID: 4569

Explanations are pervasive in our lives. Mostly, they occur in dialogical form where an explainer discusses a concept or phenomenon of interest with an explainee. Leaving the explainee with a clear understanding is not straightforward due to the knowledge gap between the two participants. Previous research looked at the interaction of explanation moves, dialogue acts, and topics in successful dialogues with expert explainers. However, daily-life explanations often fail, raising the question of what makes a dialogue successful. In this work, we study explanation dialogues in terms of the interactions between the explainer and explainee and how they correlate with the quality of explanations in terms of a successful understanding on the explainee's side. In particular, we first construct a corpus of 399 dialogues from the Reddit forum Explain Like I am Five and annotate it for interaction flows and explanation quality. We then analyze the interaction flows, comparing them to those appearing in expert dialogues. Finally, we encode the interaction flows using two language models that can handle long inputs, and we provide empirical evidence for the effectiveness boost gained through the encoding in predicting the success of explanation dialogues. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1947 - Amorim 2024
text2story: A Python Toolkit to Extract and Visualize Story Components of Narrative Text

Amorim, E.; Campos, R.; Jorge, A.; Mota, P.; Almeida, R.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():15761-15772

European Language Resources Association (ELRA) 2024

Ref ID: 4504

Story components, namely, events, time, participants, and their relations are present in narrative texts from different domains such as journalism, medicine, finance, and law. The automatic extraction of narrative elements encompasses several NLP tasks such as Named Entity Recognition, Semantic Role Labeling, Event Extraction, and Temporal Inference. The text2story Python package, an easy-to-use modular library, supports the narrative extraction and visualization pipeline. The package contains an array of narrative extraction tools that can be used separately or in sequence. With this toolkit, end users can process free text in English or Portuguese and obtain formal representations, like standard annotation files or a formal logical representation. The toolkit also enables narrative visualization as Message Sequence Charts (MSC), Knowledge Graphs, and Bubble Diagrams, making it useful to visualize and transform human-annotated narratives. The package combines the use of off-the-shelf and custom tools and is easily patched (replacing existing components) and extended (e.g. with new visualizations). It includes an experimental module for narrative element effectiveness assessment and is therefore also a valuable asset for researchers developing solutions for narrative extraction. To evaluate the baseline components, we present some results of the main annotators embedded in our package for datasets in English and Portuguese. We also compare the results with the extraction of narrative elements by GPT-3, a robust LLM. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#2058 - An 2023
Construction and application of Chinese breast cancer knowledge graph based on multi-source heterogeneous data

An, B.

Math Biosci Eng 2023;20(4):6776-6799

2023

DOI: 10.3934/mbe.2023292 · Ref ID: 5871

The knowledge graph is a critical resource for medical intelligence. The general medical knowledge graph tries to include all diseases and contains much medical knowledge. However, it is challenging to review all the triples manually. Therefore, the quality of such a knowledge graph cannot support intelligent medical applications. Breast cancer currently has one of the highest incidence rates of any cancer. It is urgent to improve the efficiency of breast cancer diagnosis and treatment through artificial intelligence technology and improve the postoperative health status of breast cancer patients. This paper proposes a framework to construct a breast cancer knowledge graph from heterogeneous data resources in response to this demand. Specifically, this paper extracts knowledge triples from clinical guidelines, medical encyclopedias and electronic medical records. Furthermore, the triples from different data resources are fused to build a breast cancer knowledge graph (BCKG). Experimental results demonstrate that BCKG can support knowledge-based question answering, breast cancer postoperative follow-up and healthcare, and improve the quality and efficiency of breast cancer diagnosis, treatment and management.

Srividya voted
Davis voted
Final decision
What was the agreed final decision?

#1304 - An 2022
Exploring Pre-Trained Language Models to Build Knowledge Graph for Metal-Organic Frameworks (MOFs)

An, Y.; Greenberg, J.; Hu, X.; Kalinowski, A.; Fang, X.; Zhao, X.; McClellan, S.; Uribe-Romo, F. J.; Langlois, K.; Furst, J.; Gomez-Gualdron, D. A.; Fajardo-Rojas, F.; Ardila, K.; Saikin, S. K.; Harper, C. A.; Daniel, R.

Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022 2022;():3651-3658

Institute of Electrical and Electronics Engineers Inc. 2022

DOI: 10.1109/BigData55660.2022.10020568 · Ref ID: 5450

Building a knowledge graph is a time-consuming and costly process which often applies complex natural language processing (NLP) methods for extracting knowledge graph triples from text corpora. Pre-trained large Language Models (PLM) have emerged as a crucial type of approach that provides readily available knowledge for a range of AI applications. However, it is unclear whether it is feasible to construct domain-specific knowledge graphs from PLMs. Motivated by the capacity of knowledge graphs to accelerate data-driven materials discovery, we explored a set of state-of-the-art pre-trained general-purpose and domain-specific language models to extract knowledge triples for metal-organic frameworks (MOFs). We created a knowledge graph benchmark with 7 relations for 1248 published MOF synonyms. Our experimental results showed that domain-specific PLMs consistently outperformed the general-purpose PLMs for predicting MOF related triples. The overall benchmarking results, however, show that using the present PLMs to create domain-specific knowledge graphs is still far from being practical, motivating the need to develop more capable and knowledgeable pre-trained language models for particular applications in materials science. © 2022 IEEE.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3591 - An 2023
Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM

An, Yuan; Greenberg, Jane; Kalinowski, Alex; Zhao, Xintong; Hu, Xiaohua; Uribe-Romo, Fernando J.; Langlois, Kyle; Furst, Jacob; Gómez-Gualdrón, Diego A.

arXiv 2023;():

2023

Ref ID: 7837

We present a comprehensive benchmark dataset for Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured databases and knowledge extracted from the literature. To enhance MOF-KG accessibility for domain experts, we aim to develop a natural language interface for querying the knowledge graph. We have developed a benchmark comprised of 161 complex questions involving comparison, aggregation, and complicated graph structures. Each question is rephrased in three additional variations, resulting in 644 questions and 161 KG queries. To evaluate the benchmark, we have developed a systematic approach for utilizing the LLM, ChatGPT, to translate natural language questions into formal KG queries. We also apply the approach to the well-known QALD-9 dataset, demonstrating ChatGPT's potential in addressing KGQA issues for different platforms and query languages. The benchmark and the proposed approach aim to stimulate further research and development of user-friendly and efficient interfaces for querying domain-specific materials science knowledge graphs, thereby accelerating the discovery of novel materials.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#73 - Anderson 2024
Bridging Domains in Chronic Lower Back Pain: Large Language Models and Ontology-Driven Strategies for Knowledge Graph Construction

Anderson, P.; Lin, D.; Davidson, J.; Migler, T.; Ho, I.; Koenig, C.; Bittner, M.; Kaplan, S.; Paraiso, M.; Buhn, N.; Stokes, E.; Hunt, C. A.; Ropella, G.; Lotz, J.

11th International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO) 2024;14849():14-30

Univ Granada, Meloneras, SPAIN Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-64636-2_2 · Ref ID: 2967

Link prediction and entity resolution play pivotal roles in uncovering hidden relationships within networks and ensuring data quality in the era of heterogeneous data integration. This paper explores the utilization of large language models to enhance link prediction, particularly through knowledge graphs derived from transdisciplinary literature. Investigating zero-shot entity resolution techniques, we examine the impact of ontology-based and large language model approaches on the stability of link prediction results. Through a case study focusing on chronic lower back pain research, we analyze workflow decisions and their influence on prediction outcomes. Our research underscores the importance of robust methodologies in improving predictive accuracy and data integration across diverse domains.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#3158 - Anelli 2024
Sixth Knowledge-aware and Conversational Recommender Systems Workshop (KaRS)

Anelli, Vito Walter; Ferrara, Antonio; Musto, Cataldo; Narducci, Fedelucio; Ragone, Azzurra; Zanker, Markus

Proceedings of the 18th ACM Conference on Recommender Systems 2024;():1245–1249

Bari, Italy Association for Computing Machinery 2024

DOI: 10.1145/3640457.3687114 · Ref ID: 7283

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2813 - Angele 1996
Propose-and-revise modeled in Karl

Angele, J.

Proceedings Mexico-USA Collaboration in Intelligent Systems Technologies. 1996;():278-287

1996

Ref ID: 6174

This paper reports an evaluation study for the specification of an average-sized expert system for configuring elevator systems using the language KARL (Knowledge Acquisition and Representation Language). Two results were gained in this study: (i) a formal model of the used problem-solving method (PSM) Propose-and-Revise was developed, and (ii) the adequacy of the language KARL for specifying such systems was evaluated. KARL is based on a strong conceptual model, the KARL model of expertise, which represents different aspects of the model at different layers. It clearly separates domain-specific knowledge from problem-solving-specific knowledge, which allows both parts to be reused independently of the other. KARL provides language primitives on a high level of abstraction, independent of implementation issues. KARL is a formal language, which allows knowledge to be represented unambiguously. KARL is an executable language, which allows the resulting model to be validated by testing and debugging. It turned out that KARL is well suited for such specification issues. It also turned out that, due to a flexible connection between domain knowledge and problem-solving knowledge provided by KARL, both kinds of knowledge may be specified nearly independently of each other, which supports their reuse. This study gave us various insights into the adequacy of the language KARL for representing the knowledge on an abstract level. In spite of the encouraging results, this study also revealed some deficiencies of the language KARL, which are currently being eliminated for a future version of KARL.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3188 - Anokhin 2024
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents

Anokhin, Petr; Semenov, Nikita; Sorokin, Artyom; Evseev, Dmitry; Burtsev, Mikhail; Burnaev, Evgeny

arXiv 2024;():

2024

Ref ID: 8447

Advancements in the capabilities of Large Language Models (LLMs) have created a promising foundation for developing autonomous agents. With the right tools, these agents could learn to solve tasks in new environments by accumulating and updating their knowledge. Current LLM-based agents process past experiences using a full history of observations, summarization, or retrieval augmentation. However, these unstructured memory representations do not facilitate the reasoning and planning essential for complex decision-making. In our study, we introduce AriGraph, a novel method wherein the agent constructs and updates a memory graph that integrates semantic and episodic memories while exploring the environment. We demonstrate that our Ariadne LLM agent, consisting of the proposed memory architecture augmented with planning and decision-making, effectively handles complex tasks within interactive text game environments difficult even for human players. Results show that our approach markedly outperforms other established memory methods and strong RL baselines in a range of problems of varying complexity. Additionally, AriGraph demonstrates competitive performance compared to dedicated knowledge graph-based methods in static multi-hop question-answering.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3106 - Aparna 2024
AI-Based Assistance for Management of Oral Community Knowledge in Low-Resource and Colloquial Kannada Language

Aparna, M.; Srivatsa, Sharath; Madhavan, G. Sai; Dinesh, T. B.; Srinivasa, Srinath

Big Data Analytics in Astronomy, Science, and Engineering: 11th International Conference on Big Data Analytics, BDA 2023, Aizu, Japan, December 5–7, 2023, Proceedings 2024;():3–16

Aizu, Japan Springer-Verlag 2024

DOI: 10.1007/978-3-031-58502-9_1 · Ref ID: 7269

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1283 - Arachchige 2023
Evaluating Large Language Models in Relationship Extraction from Unstructured Data: Empirical Study from Holocaust Testimonies

Arachchige, I. A. N.; Ha, L. A.; Mitkov, R.; Nahar, V.

International Conference Recent Advances in Natural Language Processing, RANLP 2023;():117-123

Incoma Ltd 2023

DOI: 10.26615/978-954-452-092-2_013 · Ref ID: 5047

Relationship extraction from unstructured data remains one of the most challenging tasks in the field of Natural Language Processing (NLP). The complexity of relationship extraction arises from the need to comprehend the underlying semantics, syntactic structures, and contextual dependencies within the text. Unstructured data poses challenges with diverse linguistic patterns, implicit relationships, and contextual nuances, complicating accurate relationship identification and extraction. The emergence of Large Language Models (LLMs), such as GPT (Generative Pre-trained Transformer), has marked a significant advancement in the field of NLP. In this work, we assess and evaluate the effectiveness of LLMs in relationship extraction from Holocaust testimonies within the context of the historical realm. By delving into this domain-specific context, we aim to gain deeper insights into the performance and capabilities of LLMs in accurately capturing and extracting relationships within the Holocaust domain by developing a novel knowledge graph to visualise the relationships of the Holocaust. To the best of our knowledge, there is no existing study which discusses relationship extraction in Holocaust testimonies. The majority of current approaches for Information Extraction (IE) in historic documents are either manual or Optical Character Recognition (OCR) based. Moreover, in this study, we found that Subject-Object-Verb extraction using GPT-3-based relations produced more meaningful results compared to Semantic Role Labeling-based triple extraction. © 2023 Incoma Ltd. All rights reserved.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#2142 - Araújo 2016
Architectural approaches to build the museum of the person

Araújo, C.; Henriques, P. R.; Martini, R. G.; Almeida, J. J.

2016 11th Iberian Conference on Information Systems and Technologies (CISTI) 2016;():1-6

2016

DOI: 10.1109/CISTI.2016.7521367 · Ref ID: 6608

The Museum of the Person (Museu da Pessoa, MP) is a virtual museum aimed at exhibiting life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. So the museum holds a heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to automatically extract the information included in the repository in order to build the web pages that realize the museum's exhibition rooms. This project started by creating a specific ontology (OntoMP) for the knowledge repository of MP. That ontology is intended to allow a conceptual navigation over the available information. We will adopt the standard for museum ontologies CIDOC-CRM (CIDOC Conceptual Reference Model) refined with FOAF to represent OntoMP. The objective of this paper is to discuss different architectural approaches to build a system that will create the virtual rooms from the XML repository to enable visitors to lookup individual life stories and also intercross information among them. The first architecture is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) technology to extract the information, while the second proposal is based on a Relational Database and uses CaVa Generator to query the repository and build the exhibition spaces.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#423 - Arnold 2022
Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots

Arnold, A.; Ernez, F.; Kobus, C.; Martin, M. C.

Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAACL) - Human Language Technologies 2022;():188-196

Seattle, WA Assoc Computational Linguistics-Acl 2022

Ref ID: 3668

During their pre-flight briefings, aircraft pilots must analyze a long list of NOTAMs (NOtice To AirMen) indicating potential hazards along the flight route, sometimes up to 100 pages for long-haul flights. NOTAM free-text fields typically have a very special phrasing, with lots of acronyms and domain-specific vocabulary, which makes it differ significantly from standard English. In this paper, we pretrain language models derived from BERT on circa 1 million unlabeled NOTAMs and reuse the learnt representations on three downstream tasks valuable for pilots: criticality prediction, named entity recognition and translation into a structured language called Airlang. This self-supervised approach, where smaller amounts of labeled data are enough for task-specific finetuning, is well suited in the aeronautical context since expert annotations are expensive and time-consuming. We present evaluation scores across the tasks showing a high potential for an operational usability of such models (by pilots, airlines or service providers), which is, to the best of our knowledge, a first.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#375 - Aspillaga 2021
Inspecting the concept knowledge graph encoded by modern language models

Aspillaga, C.; Mendoza, M.; Soto, A.

Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():2984-3000

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 3072

The field of natural language understanding has experienced exponential progress in the last few years, with impressive results in several tasks. This success has motivated researchers to study the underlying knowledge encoded by these models. Despite this, attempts to understand their semantic capabilities have not been successful, often leading to non-conclusive, or contradictory conclusions among different works. Via a probing classifier, we extract the underlying knowledge graph of nine of the most influential language models of the last years, including word embeddings, text generators, and context encoders. This probe is based on concept relatedness, grounded on WordNet. Our results reveal that all the models encode this knowledge, but suffer from several inaccuracies. Furthermore, we show that the different architectures and training strategies lead to different model biases. We conduct a systematic evaluation to discover specific factors that explain why some concepts are challenging. We hope our insights will motivate the development of models that capture concepts more precisely.

Kwesi voted
Davis voted
Final decision
What was the agreed final decision?

#3760 - Avnat 2024
Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As

Avnat, Eden; Levy, Michal; Herstain, Daniel; Yanko, Elia; Joya, Daniel Ben; Katz, Michal Tzuchman; Eshel, Dafna; Laros, Sahar; Dagan, Yael; Barami, Shahar; Mermelstein, Joseph; Ovadia, Shahar; Shomron, Noam; Shalev, Varda; Abdulnour, Raja-Elie E.

arXiv 2024;():

2024

Ref ID: 8359

Clinical problem-solving requires processing of semantic medical knowledge such as illness scripts and numerical medical knowledge of diagnostic tests for evidence-based decision-making. As large language models (LLMs) show promising results in many aspects of language-based clinical practice, their ability to generate non-language evidence-based answers to clinical questions is inherently limited by tokenization. Therefore, we evaluated LLMs' performance on two question types: numeric (correlating findings) and semantic (differentiating entities) while examining differences within and between LLMs in medical aspects and comparing their performance to humans. To generate straightforward multi-choice questions and answers (QAs) based on evidence-based medicine (EBM), we used a comprehensive medical knowledge graph (encompassing data from more than 50,00 peer-reviewed articles) and created the "EBMQA". EBMQA contains 105,000 QAs labeled with medical and non-medical topics and classified into numerical or semantic questions. We benchmarked this dataset using more than 24,500 QAs on two state-of-the-art LLMs: Chat-GPT4 and Claude3-Opus. We evaluated the LLMs' accuracy on semantic and numerical question types and according to sub-labeled topics. For validation, six medical experts were tested on 100 numerical EBMQA questions. We found that both LLMs excelled more in semantic than numerical QAs, with Claude3 surpassing GPT4 in numerical QAs. However, both LLMs showed gaps both between and within models in different medical aspects and remained inferior to humans. Thus, their medical advice should be addressed carefully.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1427 - Azaria 2023
The Internal State of an LLM Knows When It's Lying

Azaria, A.; Mitchell, T.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():967-976

Association for Computational Linguistics (ACL) 2023

DOI: 10.18653/v1/2023.findings-emnlp.68 · Ref ID: 5057

While Large Language Models (LLMs) have shown exceptional performance in various tasks, one of their most prominent drawbacks is generating inaccurate or false information with a confident tone. In this paper, we provide evidence that the LLM's internal state can be used to reveal the truthfulness of statements. This includes both statements provided to the LLM, and statements that the LLM itself generates. Our approach is to train a classifier that outputs the probability that a statement is truthful, based on the hidden layer activations of the LLM as it reads or generates the statement. Experiments demonstrate that given a set of test sentences, of which half are true and half false, our trained classifier achieves an average of 71% to 83% accuracy labeling which sentences are true versus false, depending on the LLM base model. Furthermore, we explore the relationship between our classifier's performance and approaches based on the probability assigned to the sentence by the LLM. We show that while LLM-assigned sentence probability is related to sentence truthfulness, this probability is also dependent on sentence length and the frequencies of words in the sentence, resulting in our trained classifier providing a more reliable approach to detecting truthfulness, highlighting its potential to enhance the reliability of LLM-generated content and its practical applicability in real-world scenarios. © 2023 Association for Computational Linguistics.
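The technique this abstract describes — training a lightweight classifier on an LLM's hidden-layer activations to score statement truthfulness — can be illustrated in miniature. The sketch below is a toy, not the paper's implementation: the "activations" are synthetic vectors in which one dimension weakly correlates with truthfulness, and a hand-rolled logistic regression stands in for the authors' classifier.

```python
import math
import random

rng = random.Random(0)
DIM = 8  # toy activation width; real hidden layers are thousands wide

def fake_activations(truthful):
    # Synthetic stand-in for hidden-layer activations: dimension 0
    # weakly encodes truthfulness, the rest is pure noise.
    base = 1.0 if truthful else -1.0
    return [base + rng.gauss(0, 0.3)] + [rng.gauss(0, 1.0) for _ in range(DIM - 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train(X, y, lr=0.1, epochs=200):
    # Plain per-sample gradient descent on logistic loss.
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # derivative of log-loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5

train_X = [fake_activations(i % 2 == 0) for i in range(200)]
train_y = [1 if i % 2 == 0 else 0 for i in range(200)]
w, b = train(train_X, train_y)

test_X = [fake_activations(i % 2 == 0) for i in range(100)]
test_y = [1 if i % 2 == 0 else 0 for i in range(100)]
acc = sum(predict(w, b, x) == bool(yv) for x, yv in zip(test_X, test_y)) / 100
```

Because the synthetic signal is strong, the probe separates the classes easily; the paper's 71–83% accuracy reflects the much noisier signal in real activations.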

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#324 - Azim 2024
Grounding Ontologies with Pre-Trained Large Language Models for Activity Based Intelligence

Azim, A.; Clark, L.; Lau, C.; Cobb, M.; Jenner, K.

Conference on Signal Processing, Sensor/Information Fusion, and Target Recognition XXXIII 2024;13057():

National Harbor, MD Spie-Int Soc Optical Engineering 2024

DOI: 10.1117/12.3013332 · Ref ID: 3490

The development of Activity Based Intelligence (ABI) requires an understanding of individual actors' intents, their interactions with other entities in the environment, and how these interactions facilitate accomplishment of their goals. Statistical modelling alone is insufficient for such analyses, mandating higher-level representations such as ontology to capture important relationships. However, constructing ontologies for ABI, ensuring they remain grounded to real-world entities, and maintaining their applicability to downstream tasks requires substantial hand-tooling by domain experts. In this paper, we propose the use of a Large Language Model (LLM) to bootstrap a grounding for such an ontology. Subsequently, we demonstrate that the experience encoded within the weights of a pre-trained LLM can be used in a zero-shot manner to provide a model of normalcy, enabling ABI analysis at the semantics level, agnostic to the precise coordinate data. This is accomplished through a sequence of two transformations, made upon a kinematic track, toward natural language narratives suitable for LLM input. The first transformation generates an abstraction of the low-level kinematic track, embedding it within a knowledge graph using a domain-specific ABI ontology. Secondly, we employ a template-driven narrative generation process to form natural language descriptions of behavior. Computation of the LLM perplexity score upon these narratives achieves grounding of the ontology. This use does not rely on any prompt engineering. In characterizing the perplexity score for any given track, we observe significant variability given chosen parameters such as sentence verbosity, attribute count, clause ordering, and so on. Consequently, we propose an approach that considers multiple generated narratives for an individual track and the distribution of perplexity scores for downstream applications. 
We demonstrate the successful application of this methodology against a semantic track association task. Our subsequent analysis establishes how such an approach can be used to augment existing kinematics-based association algorithms.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#175 - Baek 2023
Direct Fact Retrieval from Knowledge Graphs without Entity Linking

Baek, J.; Aji, A. F.; Lehmann, J.; Hwang, S. J.

61st Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2023;():10038-10055

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3227

There has been a surge of interest in utilizing Knowledge Graphs (KGs) for various natural language processing/understanding tasks. The conventional mechanism to retrieve facts in KGs usually involves three steps: entity span detection, entity disambiguation, and relation classification. However, this approach requires additional labels for training each of the three subcomponents in addition to pairs of input texts and facts, and also may accumulate errors propagated from failures in previous steps. To tackle these limitations, we propose a simple knowledge retrieval framework, which directly retrieves facts from the KGs given the input text based on their representational similarities, which we refer to as Direct Fact Retrieval (DiFaR). Specifically, we first embed all facts in KGs onto a dense embedding space by using a language model trained by only pairs of input texts and facts, and then provide the nearest facts in response to the input text. Since the fact, consisting of only two entities and one relation, has little context to encode, we propose to further refine ranks of top-k retrieved facts with a reranker that contextualizes the input text and the fact jointly. We validate our DiFaR framework on multiple fact retrieval tasks, showing that it significantly outperforms relevant baselines that use the three-step approach.
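The first retrieval stage the abstract describes — embed every verbalized fact once, then return the facts nearest to the query embedding — can be sketched as a toy. A bag-of-words counter stands in for the dense encoder, the reranking stage is omitted, and the example facts are invented for illustration; none of this is the paper's actual model.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a dense encoder: bag-of-words token counts.
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for absent keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Illustrative KG facts as (head, relation, tail) triples.
facts = [
    ("Inception", "directed by", "Christopher Nolan"),
    ("Inception", "released in", "2010"),
    ("The Matrix", "directed by", "the Wachowskis"),
]

# Index: embed each verbalized fact once, up front.
index = [(f, embed(" ".join(f))) for f in facts]

def retrieve(query, k=2):
    # Rank all facts by similarity to the query embedding, keep top-k.
    q = embed(query)
    ranked = sorted(index, key=lambda fe: cosine(q, fe[1]), reverse=True)
    return [f for f, _ in ranked[:k]]

top = retrieve("Who directed Inception?")
```

In DiFaR the top-k facts would then be rescored by a reranker that encodes query and fact jointly, since a bare triple carries little context on its own.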

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#463 - Baghdasaryan 2024
Knowledge retrieval and diagnostics in cloud services with large language models

Baghdasaryan, A.; Bunarjyan, T.; Poghosyan, A.; Harutyunyan, A.; El-Zein, J.

Expert Syst. Appl. 2024;255():10

2024

DOI: 10.1016/j.eswa.2024.124736 · Ref ID: 3684

Efficient customer support is the foundation for any service provider trying to improve customer relationships. An important measure of successful support is the mean time to resolve issues. The complexity and large scale of modern cloud environments make it unrealistic to reduce the resolution time without deploying intelligent solutions. Such environments also provide an exceptional opportunity to leverage cross-customer product usage data for proactive solutions, where the troubles of some users can be analyzed in advance to prevent similar issues for other users. We build a recommender system that matches customer support requests to other resolved support requests or knowledge base articles that contain valuable information for problem remediation. This system can be used by customers or support teams to quickly find problem-resolution tips or detect trending issues to warn vulnerable users. We utilize large language models, fine-tune them for better performance, and discuss capabilities and possible improvements. During our research, we highlighted several evaluation metrics such as mean time to resolve issues and the accuracy of recommendations. However, estimating accuracy is challenging due to insufficient datasets with precise and comprehensive recommendations. Despite this, our support managers provided some estimates regarding the remediation durations. Typically, identifying and resolving an issue takes several days or weeks. With appropriate recommendations, this time can be significantly reduced to several hours and, in some simple cases, even lead to self-service capabilities.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3587 - Bahr 2024
Knowledge Graph Enhanced Retrieval-Augmented Generation for Failure Mode and Effects Analysis

Bahr, Lukas; Wehner, Christoph; Wewerka, Judith; Bittencourt, José; Schmid, Ute; Daub, Rüdiger

arXiv 2024;():

2024

Ref ID: 8426

Failure mode and effects analysis (FMEA) is a critical tool for mitigating potential failures, particularly during ramp-up phases of new products. However, its effectiveness is often limited by the missing reasoning capabilities of FMEA tools, which are usually tabular in structure. Meanwhile, large language models (LLMs) offer novel prospects for fine-tuning on custom datasets for reasoning within FMEA contexts. However, LLMs face challenges in tasks that require factual knowledge, a gap that retrieval-augmented generation (RAG) approaches aim to fill. RAG retrieves information from a non-parametric data store and uses a language model to generate responses. Building on this idea, we propose to advance the non-parametric data store with a knowledge graph (KG). By enhancing the RAG framework with a KG, our objective is to leverage analytical and semantic question-answering capabilities on FMEA data. This paper contributes a new ontology for FMEA observations, an algorithm for creating vector embeddings from the FMEA KG, and a KG-enhanced RAG framework. Our approach is validated through a human study, and we measure performance in terms of context-retrieval recall and precision.
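An illustrative sketch of KG-enhanced RAG as described here: retrieve triples whose entities appear in the question and serialize them as prompt context. The FMEA triples and prompt template below are hypothetical, and entity matching is simplified to substring lookup.

```python
def retrieve_triples(question, kg):
    # Keep triples whose head or tail entity is mentioned in the question.
    q = question.lower()
    return [(h, r, t) for h, r, t in kg if h.lower() in q or t.lower() in q]

def build_prompt(question, kg):
    """Serialize retrieved KG triples into the non-parametric context of a RAG prompt."""
    ctx = "\n".join(f"{h} -[{r}]-> {t}" for h, r, t in retrieve_triples(question, kg))
    return f"Context:\n{ctx}\n\nQuestion: {question}\nAnswer:"

kg = [
    ("solder joint crack", "failure_effect", "intermittent contact"),
    ("solder joint crack", "failure_cause", "thermal cycling"),
    ("seal wear", "failure_effect", "fluid leak"),
]
prompt = build_prompt("What causes a solder joint crack?", kg)
```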

Kwesi voted
Mike voted
Final decision
What was the agreed final decision?

#480 - Bai 2023
KnowPrefix-Tuning: A Two-Stage Prefix-Tuning Framework for Knowledge-Grounded Dialogue Generation

Bai, J. Q.; Yan, Z.; Yang, Z.; Yang, J.; Liang, X. N.; Guo, H. C.; Li, Z. J.

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2023;14170():525-542

Turin, ITALY Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-43415-0_31 · Ref ID: 3766

Existing knowledge-grounded conversation systems typically generate responses in a retrieve-then-generate manner. They require a large knowledge base and a strong knowledge retrieval component, which is time- and resource-consuming. In this paper, we address the challenge by leveraging the inherent knowledge encoded in pre-trained language models (PLMs). We propose Knowledgeable Prefix Tuning (KnowPrefix-Tuning), a two-stage tuning framework that bypasses the retrieval process in a knowledge-grounded conversation system by injecting prior knowledge into a lightweight knowledge prefix. The knowledge prefix is a sequence of continuous knowledge-specific vectors that can be learned during training. In addition, we propose a novel interactive re-parameterization mechanism that allows the prefix to interact fully with the PLM during the optimization of response generation. Experimental results demonstrate that KnowPrefix-Tuning outperforms fine-tuning and other lightweight tuning approaches, and performs comparably with strong retrieval-based baselines while being 3x faster during inference. The code is available at https://github.com/fantast4ever/KnowPrefix-Tuning.
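A toy illustration of the knowledge-prefix idea: a short sequence of continuous vectors is prepended to the token embeddings before they reach the frozen PLM. All numbers and the embedding stand-in below are invented; in the paper the prefix is learned during the two-stage tuning, not fixed.

```python
PREFIX_LEN, DIM = 4, 8

# Stand-in for a learned knowledge prefix: PREFIX_LEN continuous vectors.
prefix = [[0.01 * (i + j) for j in range(DIM)] for i in range(PREFIX_LEN)]

def embed_tokens(tokens):
    # Deterministic toy stand-in for a real embedding layer.
    return [[float((sum(map(ord, t)) + j) % 7) for j in range(DIM)] for t in tokens]

def with_knowledge_prefix(tokens):
    """Concatenate the (learned) prefix with the input token embeddings;
    the frozen PLM then attends over the combined sequence."""
    return prefix + embed_tokens(tokens)

seq = with_knowledge_prefix(["knowledge", "grounded", "reply"])
```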

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#373 - Bai 2024
Infusing internalized knowledge of language models into hybrid prompts for knowledgeable dialogue generation

Bai, J. Q.; Yan, Z.; Zhang, S.; Yang, J.; Guo, H. C.; Li, Z. J.

Knowledge-Based Syst. 2024;296():11

2024

DOI: 10.1016/j.knosys.2024.111874 · Ref ID: 3511

Existing knowledge-grounded dialogue (KGD) systems access knowledge from an external knowledge base and then generate a context-coherent response accordingly. However, the knowledge access capability is constrained by the scale of the knowledge base. On the one hand, a small-scale knowledge base makes it hard for a model to generalize to unseen topics, while an improper shift of topics may induce an unsmooth conversation flow. On the other hand, a large-scale knowledge base requires a strong retrieval component to accurately index the context-relevant knowledge from many plausible candidates, costing significant amounts of time and resources. To address this, we regard the language model as a virtual knowledge base and propose homogenizing the internalized knowledge of different language models into hybrid prompts. The hybrid prompts are a set of continuous vectors learned to represent knowledge inherently encoded in different language models. Furthermore, we devise a two-stage knowledge-grounding manner, in which both the knowledge internalized in language models and the knowledge provided by evidence can be jointly optimized to generate a knowledgeable response. We compare our proposed method with two groups of methods, including methods with explicit knowledge retrieval and those with implicit knowledge access. Experimental results on three knowledge-grounded dialogue corpora demonstrate advantages over these competitive methods.

Davis voted
Ishan voted
Final decision
What was the agreed final decision?

#1731 - Baldazzi 2024
“Please, Vadalog, tell me why”: Interactive Explanation of Datalog-based Reasoning

Baldazzi, T.; Bellomarini, L.; Ceri, S.; Colombo, A.; Gentili, A.; Sallinger, E.

Advances in Database Technology - EDBT 2024;27():834-837

OpenProceedings.org 2024

DOI: 10.48786/edbt.2024.82 · Ref ID: 4080

Integrating Large Language Models (LLMs) with logic-based Enterprise Knowledge Graphs (EKGs) and more generally with Knowledge Representation and Reasoning (KRR) approaches is currently at the forefront of research in many data-intensive areas, as language models may complement EKGs and ontological reasoning with flexibility and human orientation. Conversely, EKGs provide transparency and explainability on the conclusions drawn, a typical weak point of LLMs, which operate opaquely. In this demo, we integrate Llama 2 with our reasoning system Vadalog and use it to turn a chase graph, i.e., the trace of an ontological reasoning process, into a human-readable business report. In other words, we show the amazing capabilities of state-of-the-art LLMs in combination with a principled exploitation of the theoretical underpinnings of logic-based reasoning. We walk the audience through a visual environment, unfolding real-world reasoning settings from the Central Bank of Italy. © 2024 Copyright held by the owner/author(s).

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1297 - Baldazzi 2024
Explaining Enterprise Knowledge Graphs with Large Language Models and Ontological Reasoning

Baldazzi, T.; Bellomarini, L.; Ceri, S.; Colombo, A.; Gentili, A.; Sallinger, E.; Atzeni, P.

OpenAccess Series in Informatics 2024;119():

Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing 2024

DOI: 10.4230/OASIcs.Tannen.2024.1 · Ref ID: 4000

In recent times, the demand for transparency and accountability in AI-driven decisions has intensified, particularly in high-stakes domains like finance and bio-medicine. This focus on the provenance of AI-generated conclusions underscores the need for decision-making processes that are not only transparent but also readily interpretable by humans, to build the trust of both users and stakeholders. In this context, the integration of state-of-the-art Large Language Models (LLMs) with logic-oriented Enterprise Knowledge Graphs (EKGs) and the broader scope of Knowledge Representation and Reasoning (KRR) methodologies is currently at the cutting edge of industrial and academic research across numerous data-intensive areas. Indeed, such a synergy is paramount as LLMs bring a layer of adaptability and human-centric understanding that complements the structured insights of EKGs. Conversely, the central role of ontological reasoning is to capture the domain knowledge, accurately handling complex tasks over a given realm of interest, and to infuse the process with transparency and a clear provenance-based explanation of the conclusions drawn, addressing the fundamental challenge of LLMs' inherent opacity and fostering trust and accountability in AI applications. In this paper, we propose a novel neuro-symbolic framework that leverages the underpinnings of provenance in ontological reasoning to enhance state-of-the-art LLMs with domain awareness and explainability, enabling them to act as natural language interfaces to EKGs. © Teodoro Baldazzi, Luigi Bellomarini, Stefano Ceri, Andrea Colombo, Andrea Gentili, Emanuel Sallinger, and Paolo Atzeni; licensed under Creative Commons License CC-BY 4.0.

Srividya voted
Davis voted
Final decision
What was the agreed final decision?

#3844 - Balepur 2024
Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?

Balepur, Nishant; Gu, Feng; Ravichander, Abhilasha; Feng, Shi; Boyd-Graber, Jordan; Rudinger, Rachel

arXiv 2024;():

2024

Ref ID: 8737

Question answering (QA), producing correct answers for input questions, is popular, but we test a reverse question answering (RQA) task: given an input answer, generate a question with that answer. Past work tests QA and RQA separately, but we test them jointly, comparing their difficulty, aiding benchmark design, and assessing reasoning consistency. Sixteen LLMs run QA and RQA with trivia questions/answers, showing: 1) versus QA, LLMs are much less accurate in RQA for numerical answers, but slightly more accurate in RQA for textual answers; 2) LLMs often answer their own invalid questions from RQA accurately in QA, so RQA errors are not from knowledge gaps alone; 3) RQA errors correlate with question difficulty and inversely correlate with answer frequencies in the Dolma corpus; and 4) LLMs struggle to give valid multi-hop questions. By finding question and answer types yielding RQA errors, we suggest improvements for LLM RQA reasoning.
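A stub of the QA/RQA round-trip consistency check this abstract studies with LLMs. Both "models" below are hypothetical lookup tables standing in for LLM calls, just to make the consistency test concrete.

```python
def qa_model(question):
    # Stand-in QA model: answer a question (an LLM in the paper).
    return {"Which river is the longest in Africa?": "the Nile"}.get(question, "unknown")

def rqa_model(answer):
    # Stand-in RQA model: generate a question intended to have `answer` as its answer.
    return {"the Nile": "Which river is the longest in Africa?"}.get(answer, "?")

def round_trip_consistent(answer):
    """RQA then QA: does the model recover its own target answer?"""
    return qa_model(rqa_model(answer)) == answer
```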

Ishan voted
Xinchen voted
Final decision
What was the agreed final decision?

#1838 - Banerjee 2023
The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing

Banerjee, D.; Nair, P. A.; Usbeck, R.; Biemann, C.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():12219-12228

Association for Computational Linguistics (ACL) 2023

Ref ID: 5199

In this work, we analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing. We perform experiments within the context of knowledge graph question answering (KGQA), where the task is to convert questions in natural language to the SPARQL query language. We observe that the query vocabulary is distinct from human vocabulary. Language Models (LMs) are predominantly trained for human language tasks, and hence, if the query vocabulary is replaced with a vocabulary more attuned to the LM tokenizer, the performance of models may improve. We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset. © 2023 Association for Computational Linguistics.
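A sketch of the vocabulary-substitution idea: rewrite SPARQL-specific symbols into tokens a pretrained LM tokenizer handles more naturally. This particular mapping is invented for illustration; the paper selects its substitutions carefully per tokenizer.

```python
# Hypothetical substitution table, applied in order.
SUBS = [("SELECT", "select"), ("WHERE", "where"), ("{", " [ "), ("}", " ] "), ("?", "var_")]

def substitute(query):
    """Replace query-language symbols with tokenizer-friendlier tokens."""
    for old, new in SUBS:
        query = query.replace(old, new)
    return " ".join(query.split())  # normalize whitespace

q = substitute("SELECT ?x WHERE { ?x a :City }")
```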

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#748 - Banerjee 2020
Self-Supervised Knowledge Triplet Learning for Zero-Shot Question Answering

Banerjee, P.; Baral, C.; Assoc Computat, Linguist

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():151-162

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 3298

The aim of all Question Answering (QA) systems is to generalize to unseen questions. Current supervised methods are reliant on expensive data annotation. Moreover, such annotations can introduce unintended annotator bias, making systems focus more on the bias than the actual task. This work proposes Knowledge Triplet Learning (KTL), a self-supervised task over knowledge graphs. We propose heuristics to create synthetic graphs for commonsense and scientific knowledge. We propose using KTL to perform zero-shot question answering, and our experiments show considerable improvements over large pre-trained transformer language models.

Ishan voted
Xinchen voted
Final decision
What was the agreed final decision?

#497 - Banerjee 2024
Large Language Models for Few-Shot Automatic Term Extraction

Banerjee, S.; Chakravarthi, B. R.; McCrae, J. P.

29th International Conference on Applications of Natural Language to Information Systems (NLDB) 2024;14762():137-150

Univ Turin, Turin, ITALY Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-70239-6_10 · Ref ID: 3176

Automatic term extraction is the process of identifying domain-specific terms in a text using automated algorithms and is a key first step in ontology learning and knowledge graph creation. Large language models have shown good few-shot capabilities, thus, in this paper, we present a study to evaluate the few-shot in-context learning performance of GPT-3.5-Turbo on automatic term extraction. To benchmark the performance we compare the results with fine-tuning of a BERT-sized model. We also carry out experiments with count-based term extractors to assess their applicability to few-shot scenarios. We quantify prompt sensitivity with experiments to analyze the variation in performance of large language models across different prompt templates. Our results show that in-context learning with GPT-3.5-Turbo outperforms the BERT-based model and unsupervised count-based methods in few-shot scenarios.
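A minimal few-shot prompt builder, to make the in-context learning setup in this abstract concrete. The template wording and examples are illustrative, not the paper's exact prompts.

```python
def few_shot_prompt(examples, target):
    """Assemble an in-context term-extraction prompt from (text, terms) demos."""
    parts = ["Extract the domain-specific terms from each text."]
    for text, terms in examples:
        parts.append(f"Text: {text}\nTerms: {', '.join(terms)}")
    # The model is expected to complete the final 'Terms:' line.
    parts.append(f"Text: {target}\nTerms:")
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    [("The patient received a stent after angioplasty.", ["stent", "angioplasty"])],
    "An ontology organizes the concepts of a domain.",
)
```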

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#331 - Bao 2020
HHH: An Online Medical Chatbot System based on Knowledge Graph and Hierarchical Bi-Directional Attention

Bao, Q. M.; Ni, L.; Liu, J. M.; Assoc Comp, Machinery

Australasian Computer Science Week Multiconference (ACSW) 2020;():

Swinburne Univ Technol, Melbourne, AUSTRALIA Assoc Computing Machinery 2020

Ref ID: 3039

This paper proposes a chatbot framework that adopts a hybrid model consisting of a knowledge graph and a text similarity model. Based on this chatbot framework, we build HHH, an online question-and-answer (QA) Healthcare Helper system for answering complex medical questions. HHH maintains a knowledge graph constructed from medical data collected from the Internet. HHH also implements a novel text representation and similarity deep learning model, the Hierarchical BiLSTM Attention Model (HBAM), to find the most similar question from a large QA dataset. We compare HBAM with other state-of-the-art language models such as bidirectional encoder representations from transformers (BERT) and the Manhattan LSTM Model (MaLSTM). We train and test the models with a subset of the Quora duplicate questions dataset in the medical area. The experimental results show that our model achieves superior performance compared to these existing methods.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3335 - Bao 2023
DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation

Bao, Zhijie; Chen, Wei; Xiao, Shengze; Ren, Kuang; Wu, Jiaao; Zhong, Cheng; Peng, Jiajie; Huang, Xuanjing; Wei, Zhongyu

arXiv 2023;():

2023

Ref ID: 7821

We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical responses in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge graphs, reconstructing real-world dialogues, and incorporating human-guided preference rephrasing. These datasets are instrumental in training DISC-MedLLM, surpassing existing medical LLMs in both single-turn and multi-turn consultation scenarios. Extensive experimental results demonstrate the effectiveness of the proposed model in bridging the gap between general language models and real-world medical consultation. Additionally, we release the constructed dataset and model weights to further contribute to research and development. Further details and resources can be found at https://github.com/FudanDISC/DISC-MedLLM

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#1246 - Bayat 2024
Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression

Bayat, F. F.; Liu, X.; Jagadish, H. V.; Wang, L.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():12388-12400

Association for Computational Linguistics (ACL) 2024

Ref ID: 4285

Large language models (LLMs) can generate long-form and coherent text, yet they often hallucinate facts, which undermines their reliability. To mitigate this issue, inference-time methods steer LLM representations toward the “truthful directions” previously learned for truth elicitation. However, applying these truthful directions with the same intensity fails to generalize across different query contexts. We propose LITO, a Learnable Intervention method for Truthfulness Optimization that automatically identifies the optimal intervention intensity tailored to each specific context. LITO explores a sequence of model generations based on increasing levels of intervention intensities. It selects the most accurate response or refuses to answer when the predictions are highly uncertain. Experiments on multiple LLMs and question-answering datasets demonstrate that LITO improves truthfulness while preserving task accuracy. The adaptive nature of LITO counters the limitations of one-size-fits-all intervention methods, maximizing truthfulness by reflecting the model's internal knowledge only when it is confident. Our code is available at https://github.com/launchnlp/LITO. © 2024 Association for Computational Linguistics.
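A simplified version of the selection rule this abstract describes: generate candidates at increasing intervention intensities, then return the most confident one or refuse. The candidates and threshold below are invented; LITO learns this decision rather than applying a fixed threshold.

```python
def lito_select(candidates, threshold=0.5):
    """candidates: (response, confidence) pairs, one per intensity level.
    Return the most confident response, or refuse when none clears the bar."""
    response, confidence = max(candidates, key=lambda c: c[1])
    return response if confidence >= threshold else "I don't know."

# One generation per intervention intensity (toy values).
answer = lito_select([("Paris", 0.4), ("Paris", 0.9), ("Lyon", 0.3)])
```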

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#2628 - Bayrak 2005
Learning contextual behavior of text data

Bayrak, C.; Joshi, H.

Fourth International Conference on Machine Learning and Applications (ICMLA'05) 2005;():6 pp.

2005

DOI: 10.1109/ICMLA.2005.46 · Ref ID: 6089

Understanding contextual behavior is very important in order to develop a context-aware retrieval system. This paper discusses the philosophy behind the development of the "evolutionary behavior of textual semantics" (EBOTS) system. The EBOTS system is a retrieval-oriented knowledge representation and management system. This paper proposes a formal model of correlation that can be combined with traditional local and global weighting schemes. Intuitive contextual behavior is studied as part of the proposed research work. Context retrieval based on semantic knowledge allows abstract queries to be defined instead of exact word-based queries. The results of context retrieval for the Classic3 and TIME datasets using the EBOTS system are discussed in this paper. The paper contributes to semantic knowledge representation and retrieval algorithms.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3544 - Beigi 2024
InternalInspector I²: Robust Confidence Estimation in LLMs through Internal States

Beigi, Mohammad; Shen, Ying; Yang, Runing; Lin, Zihao; Wang, Qifan; Mohan, Ankith; He, Jianfeng; Jin, Ming; Lu, Chang-Tien; Huang, Lifu

arXiv 2024;():

2024

Ref ID: 8398

Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#36 - Bellan 2022
Assisted Process Knowledge Graph Building Using Pre-trained Language Models

Bellan, P.; Dragoni, M.; Ghidini, C.

21st International Conference of the Italian-Association-for-Artificial-Intelligence (AIxIA) 2022;13796():60-74

Udine, ITALY Springer International Publishing Ag 2022

DOI: 10.1007/978-3-031-27181-6_5 · Ref ID: 2928

The automated construction of knowledge graphs from procedural documents is a challenging research area. Here, the lack of annotated data, as well as of raw text repositories describing real-world procedural documents, makes it extremely difficult to adopt deep learning approaches. Pre-trained language models have shown promising results concerning knowledge extraction tasks from the models themselves. Although several works have explored this strategy to build knowledge graphs, the viability of knowledge base construction using a prompt-based learning strategy on such language models has not yet been investigated deeply. In this work, we present a prompt-based in-context learning strategy to extract, from natural language process descriptions, conceptual information that can be converted into their equivalent knowledge graphs. Such a strategy is performed in a multi-turn dialog fashion. We validate the accuracy of the proposed approach from both quantitative and qualitative perspectives. The results highlight the feasibility of the proposed approach within low-resource scenarios.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#669 - Bellan 2024
Process Knowledge Extraction and Knowledge Graph Construction Through Prompting: A Quantitative Analysis

Bellan, P.; Dragoni, M.; Ghidini, C.; Assoc Computing, Machinery

39th Annual ACM Symposium on Applied Computing (SAC) 2024;():1634-1641

Univ Salamanca, Avila, SPAIN Assoc Computing Machinery 2024

DOI: 10.1145/3605098.3635957 · Ref ID: 3080

The automated construction of process knowledge graphs from process description documents is a challenging research area. Here, the lack of massive annotated data, as well as raw text repositories describing real-world process documents, makes it extremely difficult to adopt deep learning approaches to perform this transformation. Indeed, the main challenge is to extract conceptual elements representing the actual entities or relations of the process model described within its corresponding natural language document. Large Language Models (LLMs) have shown promising results in supporting the extraction of structured knowledge from unstructured texts. Although several works explored this strategy to build or complete knowledge graphs, the exploitation of LLMs toward domain-specific knowledge base construction from scratch has not yet been investigated deeply. Our aim is to exploit the LLM capabilities to extract process knowledge from unseen natural language descriptions. In this work, we present a prompt-based in-context learning strategy to extract, from process descriptions, conceptual information that can be converted into their equivalent knowledge graphs. Such a strategy is performed in a multi-turn dialog fashion. We validate the accuracy of the proposed approach from a quantitative perspective. The results highlight the feasibility of the proposed approach within our low-resource scenarios and open interesting perspectives for future activities.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3578 - Bendiken 2024
KNOW: A Real-World Ontology for Knowledge Capture with Large Language Models

Bendiken, Arto

arXiv 2024;():

2024

Ref ID: 8333

We present KNOW–the Knowledge Navigator Ontology for the World–the first ontology designed to capture everyday knowledge to augment large language models (LLMs) in real-world generative AI use cases such as personal AI assistants. Our domain is human life, both its everyday concerns and its major milestones. We have limited the initial scope of the modeled concepts to only established human universals: spacetime (places, events) plus social (people, groups, organizations). The inclusion criteria for modeled concepts are pragmatic, beginning with universality and utility. We compare and contrast previous work such as Schema.org and Cyc–as well as attempts at a synthesis of knowledge graphs and language models–noting how LLMs already encode internally much of the commonsense tacit knowledge that took decades to capture in the Cyc project. We also make available code-generated software libraries for the 12 most popular programming languages, enabling the direct use of ontology concepts in software engineering. We emphasize simplicity and developer experience in promoting AI interoperability.

Mike voted
Ishan voted
Final decision
What was the agreed final decision?

#1115 - Bertini 2024
Concept2Text: an explainable multilingual rewriting of concepts into natural language

Bertini, F.; Dal Palù, A.; Fabiano, F.; Formisano, A.; Zaglio, F.

CEUR Workshop Proceedings 2024;3733():

CEUR-WS 2024

Ref ID: 4442

Automated and explainable data interpretation hinges on two critical steps: (i) identifying emerging properties from data and representing them into abstract concepts, and (ii) translating such concepts into natural language. While Large Language Models have recently demonstrated impressive capabilities in generating natural language, their trustworthiness remains difficult to ascertain. The deployment of an explainable pipeline enables its application in high-risk activities, such as decision making. Addressing this demanding requirement is facilitated by the fertile ground of knowledge representation and automated reasoning research. Building upon previous work that explored the first step, we focus on the second step, named Concept2Text. The design of an explainable translation naturally lends itself to a logic-based model, once again highlighting the contribution of declarative programming to achieving explainability in AI. This paper explores a Prolog/CLP-based rewriting system designed to interpret concepts expressed in terms of classes and relations derived from a generic ontology, generating text in natural language. Its key features encompass hierarchical tree rewritings, modular multilingual generation, support for equivalent variants across semantic, grammar, and lexical levels, and a transparent rule-based system. We present the architecture and illustrate a simple working example that allows the generation of hundreds of different and equivalent rewritings relative to the input concept. © 2024 Copyright for this paper by its authors.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2423 - Beydoun 2009
FAML: A Generic Metamodel for MAS Development

Beydoun, G.; Low, G.; Henderson-Sellers, B.; Mouratidis, H.; Gomez-Sanz, J. J.; Pavon, J.; Gonzalez-Perez, C.

IEEE Transactions on Software Engineering 2009;35(6):841-863

2009

DOI: 10.1109/TSE.2009.34 · Ref ID: 6529

In some areas of software engineering research, there are several metamodels claiming to capture the main issues. Though it is profitable to have variety at the beginning of a research field, after some time, the diversity of metamodels becomes an obstacle, for instance to the sharing of results between research groups. To reach consensus and unification of existing metamodels, metamodel-driven software language engineering can be applied. This paper illustrates an application of software language engineering in the agent-oriented software engineering research domain. Here, we introduce a relatively generic agent-oriented metamodel whose suitability for supporting modeling language development is demonstrated by evaluating it with respect to several existing methodology-specific metamodels. First, the metamodel is constructed by a combination of bottom-up and top-down analysis and best practice. The concepts thus obtained and their relationships are then evaluated by mapping to two agent-oriented metamodels: TAO and Islander. We then refine the metamodel by extending the comparisons with the metamodels implicit or explicit within five more extant agent-oriented approaches: Adelfe, PASSI, Gaia, INGENIAS, and Tropos. The resultant FAML metamodel is a potential candidate for future standardization as an important component for engineering an agent modeling language.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#442 - Bhana 2022
Knowledge Graph Fusion for Language Model Fine-Tuning

Bhana, N.; van Zyl, T. L.; Ieee

9th International Conference on Soft Computing and Machine Intelligence (ISCMI) 2022;():167-172

Toronto, CANADA Ieee 2022

DOI: 10.1109/iscmi56532.2022.10068451 · Ref ID: 3063

Language Models such as BERT (Bidirectional Encoder Representations from Transformers) have grown in popularity due to their ability to be pre-trained and perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text, useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack the global context or domain knowledge required for complete language understanding. To address these limitations, we investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side-effect, changes made to K-BERT for accommodating the English language also extend to other word-based languages. Experiments conducted indicate that injected knowledge introduces noise. We see statistically significant improvements for knowledge-driven tasks when this noise is minimised. We show evidence that, given the appropriate task, modest injection with relevant, high-quality knowledge is most performant.
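A toy version of K-BERT-style knowledge injection as described above: expand entity mentions in a sentence with matching KG triples before encoding. Tokenization and the triple store are simplified stand-ins, and the bracket format is invented for illustration.

```python
def inject_knowledge(sentence, kg):
    """Append (relation, tail) info after each entity mention, K-BERT style (simplified)."""
    out = []
    for token in sentence.split():
        out.append(token)
        for head, rel, tail in kg:
            if token == head:  # naive exact-match entity linking
                out.append(f"({rel} {tail})")
    return " ".join(out)

kg = [("Paris", "capital_of", "France")]
enriched = inject_knowledge("Paris is beautiful", kg)
```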

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#1711 - Bhargava 2024
Overcoming the Challenges of Large Language Models: Introducing a Novel Proposition for Synthetic Data Validation

Bhargava, U.; Teresha, Y.; Koul, N.; Chavan, C. P.

2024 IEEE 7th International Conference on Big Data and Artificial Intelligence, BDAI 2024 2024;():290-295

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/BDAI62182.2024.10692968 · Ref ID: 4146

The market debut of ChatGPT gave rise to the development and deployment of various other Large Language Models (LLMs) that achieve state-of-the-art performance across various tasks. The growing popularity of these models has motivated some to attempt to construct or enhance their own LLM. We must be aware of the significant problems that already exist and that we might face along the way. This paper aims to identify and investigate the main challenges in this field, provide existing solutions, and propose novel approaches to mitigate them. A unique Truth-Table proposition for validating synthetic data is presented and examined on two models, along with a bidirectional knowledge-graph-based solution for curing the reverse curse problem, data generation strategies, domain adaptation methods, and the use of a custom dataset to address model hallucinations. The methodology and findings of this study provide valuable insights for users, researchers, and industry experts who are interested in LLMs. It serves as a reference for future research on current models, for refining models, or for developing domain-specific ones. © 2024 IEEE.

Davis voted
mohammed afaan voted

#367 - Bhatia 2023
Inductive Reasoning in Minds and Machines

Bhatia, S.

Psychol. Rev. 2023;():20

2023

DOI: 10.1037/rev0000446 · Ref ID: 3734

Induction, the ability to generalize from existing knowledge, is the cornerstone of intelligence. Cognitive models of human induction are largely limited to toy problems and cannot make quantitative predictions for the thousands of different induction arguments that have been studied by researchers, or for the countless induction arguments that could be encountered in everyday life. Leading large language models (LLMs) go beyond toy problems but fail to mimic observed patterns of human induction. In this article, we combine rich knowledge representations obtained from LLMs with theories of human inductive reasoning developed by cognitive psychologists. We show that this integrative approach can capture several benchmark empirical findings on human induction and generate human-like responses to natural language arguments with thousands of common categories and properties. These findings shed light on the cognitive mechanisms at play in human induction and show how existing theories in psychology and cognitive science can be integrated with new methods in artificial intelligence to successfully model high-level human cognition.

Davis voted
Srividya voted

#3863 - Bhusal 2024
SECURE: Benchmarking Large Language Models for Cybersecurity Advisory

Bhusal, Dipkamal; Alam, Md Tanvirul; Nguyen, Le; Mahara, Ashim; Lightcap, Zachary; Frazier, Rodney; Fieblinger, Romy; Torales, Grace Long; Blakely, Benjamin A.; Rastogi, Nidhi

arXiv 2024;():

2024

Ref ID: 8336

Large Language Models (LLMs) have demonstrated potential in cybersecurity applications but have also raised concerns due to problems like hallucinations and a lack of truthfulness. Existing benchmarks provide general evaluations but do not sufficiently address the practical and applied aspects of LLM performance in cybersecurity-specific tasks. To address this gap, we introduce SECURE (Security Extraction, Understanding & Reasoning Evaluation), a benchmark designed to assess LLM performance in realistic cybersecurity scenarios. SECURE includes six datasets focused on the Industrial Control System sector to evaluate knowledge extraction, understanding, and reasoning based on industry-standard sources. Our study evaluates seven state-of-the-art models on these tasks, providing insights into their strengths and weaknesses in cybersecurity contexts, and offering recommendations for improving the reliability of LLMs as cyber advisory tools.

yuexi voted
mohammed afaan voted

#111 - Bi 2024
CodeKGC: Code Language Model for Generative Knowledge Graph Construction

Bi, Z.; Chen, J.; Jiang, Y. N.; Xiong, F. Y.; Guo, W.; Chen, H. J.; Zhang, N. Y.

ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2024;23(3):16

2024

DOI: 10.1145/3641850 · Ref ID: 2936

Current generative knowledge graph construction approaches usually fail to capture structural knowledge by simply flattening natural language into serialized texts or a specification language. However, large generative language models trained on structured data such as code have demonstrated impressive capability in understanding natural language for structural prediction and reasoning tasks. Intuitively, we address the task of generative knowledge graph construction with a code language model: given a code-format natural language input, the target is to generate triples, which can be represented as code completion tasks. Specifically, we develop schema-aware prompts that effectively utilize the semantic structure within the knowledge graph. As code inherently possesses structure, such as class and function definitions, it serves as a useful model for prior semantic structural knowledge. Furthermore, we employ a rationale-enhanced generation method to boost performance. Rationales provide intermediate steps, thereby improving knowledge extraction abilities. Experimental results indicate that the proposed approach obtains better performance on benchmark datasets compared with baselines.

Davis voted
Srividya voted

#2676 - BinoPatricPrakash 2014
Mining semantic representation from medical text: A Bayesian approach

Bino Patric Prakash, G.; Jacob, S. G.; Radhameena, S.

2014 International Conference on Recent Trends in Information Technology 2014;():1-4

2014

DOI: 10.1109/ICRTIT.2014.6996197 · Ref ID: 6237

Machine learning is a subfield of artificial intelligence that deals with the exploration and construction of systems that can learn from data. Machine learning trains computers to manage critical situations via examination, self-training, and inference from observation and previous experience. This paper provides an overview of the development of an efficient classifier that represents the semantics in medical data (Medline) from a Machine Learning (ML) perspective. Nowadays, people are more concerned about their health and explore ways to identify health-related information, but identifying the semantic representation of medical terms is a difficult task. The main goal of our work was to identify the semantic representation of the medical abstracts in the Medline repository using Machine Learning and Natural Language Processing (NLP).

Davis voted
mohammed afaan voted

#233 - Biswas 2022
Entity Type Prediction Leveraging Graph Walks and Entity Descriptions

Biswas, R.; Portisch, J.; Paulheim, H.; Sack, H.; Alam, M.

21st International Semantic Web Conference (ISWC) 2022;13489():392-410

Electr Network Springer International Publishing Ag 2022

DOI: 10.1007/978-3-031-19433-7_23 · Ref ID: 3398

The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation or human curation. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper presents GRAND, a novel approach for entity typing leveraging different graph walk strategies in RDF2vec together with textual entity descriptions. RDF2vec first generates graph walks and then uses a language model to obtain embeddings for each node in the graph. This study shows that the walk generation strategy and the embedding model have a significant effect on the performance of the entity typing task. The proposed approach outperforms the baseline approaches on the benchmark datasets DBpedia and FIGER for entity typing in KGs for both fine-grained and coarse-grained classes. The results show that the combination of order-aware RDF2vec variants together with the contextual embeddings of the textual entity descriptions achieves the best results.

Srividya voted
Davis voted

#1135 - Biswas 2021
Contextual language models for knowledge graph completion

Biswas, R.; Sofronova, R.; Alam, M.; Sack, H.

CEUR Workshop Proceedings 2021;2997():

CEUR-WS 2021

Ref ID: 5604

Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and evaluated on two benchmark datasets. The initial results obtained from fine-tuning the GPT-2 model for triple classification underscore the importance of using NLMs for KGC. The impact of contextual language models on KGC is also discussed. © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Davis voted
Srividya voted

#89 - Biswas 2021
Cat2Type: Wikipedia Category Embeddings for Entity Typing in Knowledge Graphs

Biswas, R.; Sofronova, R.; Sack, H.; Alam, M.; Acm

11th Knowledge Capture Conference (K-CAP) 2021;():81-88

Electr Network Assoc Computing Machinery 2021

DOI: 10.1145/3460210.3493575 · Ref ID: 3239

The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper introduces an approach named Cat2Type which exploits Wikipedia categories to predict the missing entity types in a KG. This work extracts information from Wikipedia category names and the Wikipedia category graph, which are sources of rich semantic information about the entities. In Cat2Type, the characteristic features of the entities encapsulated in Wikipedia category names are exploited using Neural Language Models. On the other hand, a Wikipedia category graph is constructed to capture the connections between the categories. Node-level representations are learned by optimizing the neighbourhood information on the Wikipedia category graph. These representations are then used for entity type prediction via classification. The performance of Cat2Type is assessed on two real-world benchmark datasets, DBpedia630k and FIGER. The experiments show that Cat2Type obtains a significant improvement over state-of-the-art approaches.

Davis voted
mohammed afaan voted

#2151 - Bleidt 2024
ArtQuest: Countering Hidden Language Biases in ArtVQA

Bleidt, T.; Eslami, S.; Melo, G. de

2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024;():7311-7320

2024

DOI: 10.1109/WACV57701.2024.00716 · Ref ID: 6992

The task of Visual Question Answering (VQA) has been studied extensively on general-domain real-world images. Transferring insights from general domain VQA to the art domain (ArtVQA) is non-trivial, as the latter requires models to identify abstract concepts, details of brushstrokes and styles of paintings in the visual data as well as possess background knowledge about art. This is exacerbated by the lack of high-quality datasets. In this work, we shed light on hidden linguistic biases in the AQUA dataset, which is the only publicly available benchmark dataset for ArtVQA. As a result, the majority of questions can be answered without consulting the visual information, making the “V” in ArtVQA rather insignificant. In order to counter this problem, we create a simple, yet practical dataset, ArtQuest, using structured information from the SemArt collection. Our dataset and the pipeline to reproduce our results are publicly available at https://github.com/bletib/artquest.

Srividya voted
Davis voted

#3491 - Boer 2024
Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering

Boer, Derian; Koch, Fabian; Kramer, Stefan

arXiv 2024;():

2024

Ref ID: 8575

Large Language Models (LLMs) frequently lack domain-specific knowledge, and even fine-tuned models tend to hallucinate. Hence, more reliable models that can include external knowledge are needed. We present a pipeline, 4StepFocus, and specifically a preprocessing step, that can substantially improve the answers of LLMs. This is achieved by providing guided access to external knowledge, making use of the model's ability to capture relational context and conduct rudimentary reasoning by itself. The method narrows down potentially correct answers by triplet-based searches in a semi-structured knowledge base in a direct, traceable fashion, before switching to latent representations for ranking those candidates based on unstructured data. This distinguishes it from related methods that are purely based on latent representations. 4StepFocus consists of the steps: 1) triplet generation for extraction of relational data by an LLM, 2) substitution of variables in those triplets to narrow down answer candidates employing a knowledge graph, 3) sorting remaining candidates with a vector similarity search involving associated non-structured data, and 4) reranking the best candidates by the LLM with background data provided. Experiments on a medical, a product recommendation, and an academic paper search test set demonstrate that this approach is indeed a powerful augmentation. It not only adds relevant, traceable background information from information retrieval but also improves performance considerably in comparison to state-of-the-art methods. This paper presents a novel, largely unexplored direction and therefore provides a wide range of future work opportunities. Used source code is available at https://github.com/kramerlab/4StepFocus.

Davis voted
brandon voted

#799 - Bombieri 2024
Surgicberta: a pre-trained language model for procedural surgical language

Bombieri, M.; Rospocher, M.; Ponzetto, S. P.; Fiorini, P.

Int. J. Data Sci. Anal. 2024;18(1):69-81

2024

DOI: 10.1007/s41060-023-00433-5 · Ref ID: 3518

Pre-trained language models are now ubiquitous in natural language processing, being successfully applied for many different tasks and in several real-world applications. However, even though there is a wealth of high-quality written materials on surgery, and the scientific community has shown a growing interest in the application of natural language processing techniques in surgery, a pre-trained language model specific to the surgical domain is still missing. The creation and public release of such a model would serve numerous useful clinical applications. For example, it could enhance existing surgical knowledge bases employed for task automation, or assist medical students in summarizing complex surgical descriptions. For this reason, in this paper, we introduce SurgicBERTa, a pre-trained language model specific for the English surgical language, i.e., the language used in the surgical domain. SurgicBERTa has been obtained from RoBERTa through continued pre-training with the Masked language modeling objective on 300 k sentences taken from English surgical books and papers, for a total of 7 million words. By publicly releasing SurgicBERTa, we make available a resource built from the content collected in many high-quality surgical books, online textual resources, and academic papers. We performed several assessments in order to evaluate SurgicBERTa, comparing it with the general domain RoBERTa. First, we intrinsically assessed the model in terms of perplexity, accuracy, and evaluation loss resulting from the continual training according to the masked language modeling task. Then, we extrinsically evaluated SurgicBERTa on several downstream tasks, namely (i) procedural sentence detection, (ii) procedural knowledge extraction, (iii) ontological information discovery, and (iv) surgical terminology acquisition. 
Finally, we conducted some qualitative analysis on SurgicBERTa, showing that it contains a lot of surgical knowledge that could be useful to enrich existing state-of-the-art surgical knowledge bases or to extract surgical knowledge. All the assessments show that SurgicBERTa better deals with surgical language than a general-purpose pre-trained language model such as RoBERTa, and therefore can be effectively exploited in many computer-assisted applications in the surgical domain.

Davis voted
Srividya voted

#2978 - Bork 2018
Systematic analysis and evaluation of visual conceptual modeling language notations

Bork, D.; Karagiannis, D.; Pittl, B.

2018 12th International Conference on Research Challenges in Information Science (RCIS) 2018;():1-11

2018

DOI: 10.1109/RCIS.2018.8406652 · Ref ID: 6377

In systems analysis and design it is common to refer to some widely used de-facto industry standards like Unified Modeling Language (UML) and Business Process Model and Notation (BPMN). Albeit the wide adoption of such standard modeling languages, only limited research focuses on the techniques in which these standards are specified and the quality they provide. Most research focuses on case studies of applying standards, ways of extending standards to domain-specific requirements, e.g., by means of profiling, or evaluations of single modeling languages, e.g., using questionnaires or semiotic theories. By contrast, this paper critically reflects on the current state of modeling standards with a focus on their graphical representation (notation). The contribution of this paper is threefold: First, a systematic analysis is performed thereby investigating how different modeling standards specify notational aspects. Second, an evaluation is performed by applying Moody's Physics of Notation theory to the identified standards. Third, based on the findings, recommendations are given to improve modeling standard specifications in the future w.r.t. their notational aspects.

mohammed afaan voted
yuexi voted

#1639 - Boscariol 2024
A METHODOLOGICAL APPROACH TO ASSET INFORMATION MANAGEMENT VIA KNOWLEDGE GRAPHS AND LARGE LANGUAGE MODELS

Boscariol, M.; Meschini, S.; Tagliabue, L. C.

Proceedings of the European Conference on Computing in Construction 2024;2024():404-411

European Council on Computing in Construction (EC3) 2024

DOI: 10.35490/EC3.2024.286 · Ref ID: 4361

Tackling the need of large organizations for a proactive Asset Information Management (AIM) system, a methodological approach to knowledge management applied to built asset portfolios is proposed. It aims at synergistically leveraging Knowledge Graphs (KGs) and Artificial Intelligence (AI) technologies to enable analytics on input data. In the theorized pipeline, Large Language Models (LLMs) are meant to be used both in the graph creation phase, extracting data from unstructured sources and organizing them according to domain ontologies, as tested on a use-case sample, and in the knowledge extraction phase via queries. © 2024 European Council on Computing in Construction.

Srividya voted
Ishan voted

#122 - Bosselut 2019
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

Bosselut, A.; Rashkin, H.; Sap, M.; Malaviya, C.; Celikyilmaz, A.; Choi, Y.; Acl

57th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2019;():4762-4779

Florence, ITALY Assoc Computational Linguistics-Acl 2019

Ref ID: 3052

We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.

yuexi voted
mohammed afaan voted

#1735 - Boudin 2010
Positional language models for clinical information retrieval

Boudin, F.; Nie, J. Y.; Dawes, M.

EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference 2010;():108-115

2010

Ref ID: 5815

The PECO framework is a knowledge representation for formulating clinical questions. Queries are decomposed into four aspects, which are Patient-Problem (P), Exposure (E), Comparison (C) and Outcome (O). However, no test collection is available to evaluate such framework in information retrieval. In this work, we first present the construction of a large test collection extracted from systematic literature reviews. We then describe an analysis of the distribution of PECO elements throughout the relevant documents and propose a language modeling approach that uses these distributions as a weighting strategy. In our experiments carried out on a collection of 1.5 million documents and 423 queries, our method was found to lead to an improvement of 28% in MAP and 50% in P@5, as compared to the state-of-the-art method. © 2010 Association for Computational Linguistics.

mohammed afaan voted
Ishan voted

#521 - Bouzid 2024
Leveraging Generative AI in Short Document Indexing

Bouzid, S.; Piron, L.

Electronics 2024;13(17):24

2024

DOI: 10.3390/electronics13173563 · Ref ID: 3593

The efficiency of information retrieval systems primarily depends on the effective representation of documents during query processing. This representation is mainly constructed from relevant document terms identified and selected during indexing, which are then used for retrieval. However, when documents contain only a few features, such as in short documents, the resulting representation may be information-poor due to a lack of index terms and their lack of relevance. Although document representation can be enriched using techniques like word embeddings, these techniques require large pre-trained datasets, which are often unavailable in the context of domain-specific short documents. This study investigates a new approach to enriching document representation during indexing using generative AI. In the proposed approach, relevant terms extracted from documents and preprocessed for indexing are enriched with a list of key terms suggested by a large language model (LLM). After conducting a small benchmark of several renowned LLMs for key-term suggestion from a set of short texts, the GPT-4o model was chosen to experiment with the proposed indexing approach. The findings of this study are notable, demonstrating that generative AI can efficiently fill the knowledge gap in document representation, regardless of the retrieval technique used.

mohammed afaan voted
yuexi voted

#3960 - Bronzini 2024
Unveiling LLMs: The Evolution of Latent Representations in a Dynamic Knowledge Graph

Bronzini, Marco; Nicolini, Carlo; Lepri, Bruno; Staiano, Jacopo; Passerini, Andrea

arXiv 2024;():

2024

Ref ID: 8218

Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of factual knowledge. However, understanding their underlying reasoning and internal mechanisms in exploiting this knowledge remains a key research area. This work unveils the factual information an LLM represents internally for sentence-level claim verification. We propose an end-to-end framework to decode factual knowledge embedded in token representations from a vector space to a set of ground predicates, showing its layer-wise evolution using a dynamic knowledge graph. Our framework employs activation patching, a vector-level technique that alters a token representation during inference, to extract encoded knowledge. Accordingly, we rely on neither training nor external models. Using factual and common-sense claims from two claim verification datasets, we showcase interpretability analyses at local and global levels. The local analysis highlights entity centrality in LLM reasoning, from claim-related information and multi-hop reasoning to representation errors causing erroneous evaluation. The global analysis, on the other hand, reveals trends in the underlying evolution, such as word-based knowledge evolving into claim-related facts. By interpreting semantics from LLM latent representations and enabling graph-related analyses, this work enhances the understanding of the factual knowledge resolution process.

yuexi voted
Srividya voted

#553 - Buehler 2024
MechGPT, a Language-Based Strategy for Mechanics and Materials Modeling That Connects Knowledge Across Scales, Disciplines, and Modalities

Buehler, M. J.

Appl. Mech. Rev. 2024;76(2):35

2024

DOI: 10.1115/1.4063843 · Ref ID: 3551

For centuries, researchers have sought out ways to connect disparate areas of knowledge. While early scholars (Galileo, da Vinci, etc.) were experts across fields, specialization took hold later. With the advent of Artificial Intelligence, we can now explore relationships across areas (e.g., mechanics-biology) or disparate domains (e.g., failure mechanics-art). To achieve this, we use a fine-tuned large language model (LLM), here for a subset of knowledge in multiscale materials failure. The approach includes the use of a general-purpose LLM to distill question-answer pairs from raw sources followed by LLM fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas. While the model has some ability to recall knowledge from training, we find that LLMs are particularly useful for extracting structural insights through Ontological Knowledge Graphs. These interpretable graph structures provide explanatory insights, frameworks for new research questions, and visual representations of knowledge that also can be used in retrieval-augmented generation. Three versions of MechGPT are discussed, featuring different sizes from 13 × 10⁹ to 70 × 10⁹ parameters, and reaching context lengths of more than 10,000 tokens. This provides ample capacity for sophisticated retrieval augmented strategies, as well as agent-based modeling where multiple LLMs interact collaboratively and/or adversarially, the incorporation of new data from the literature or web searches, as well as multimodality.

Xinchen voted
mohammed afaan voted

#310 - Buehler 2024
Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Interpretive Large Language Model-Based Materials Design

Buehler, M. J.

ACS Eng. Au 2024;4(2):241-277

2024

DOI: 10.1021/acsengineeringau.3c00058 · Ref ID: 3519

Transformer neural networks show promising capabilities, in particular for uses in materials analysis, design, and manufacturing, including their capacity to work effectively with human language, symbols, code, and numerical data. Here, we explore the use of large language models (LLMs) as a tool that can support engineering analysis of materials, applied to retrieving key information about subject areas, developing research hypotheses, discovery of mechanistic relationships across disparate areas of knowledge, and writing and executing simulation codes for active knowledge generation based on physical ground truths. Moreover, when used as sets of AI agents with specific features, capabilities, and instructions, LLMs can provide powerful problem-solution strategies for applications in analysis and design problems. Our experiments focus on using a fine-tuned model, MechGPT, developed based on training data in the mechanics of materials domain. We first affirm how fine-tuning endows LLMs with a reasonable understanding of subject area knowledge. However, when queried outside the context of learned matter, LLMs can have difficulty recalling correct information and may hallucinate. We show how this can be addressed using retrieval-augmented Ontological Knowledge Graph strategies. The graph-based strategy helps us not only to discern how the model understands what concepts are important but also how they are related, which significantly improves generative performance and also naturally allows for injection of new and augmented data sources into generative AI algorithms. We find that the additional feature of relatedness provides advantages over regular retrieval augmentation approaches and not only improves LLM performance but also provides mechanistic insights for exploration of a material design process. 
Illustrated for a use case of relating distinct areas of knowledge, here, music and proteins, such strategies can also provide an interpretable graph structure with rich information at the node, edge, and subgraph level that provides specific insights into mechanisms and relationships. We discuss other approaches to improve generative qualities, including nonlinear sampling strategies and agent-based modeling that offer enhancements over single-shot generations, whereby LLMs are used to both generate content and assess content against an objective target. Examples provided include complex question answering, code generation, and execution in the context of automated force-field development from actively learned density functional theory (DFT) modeling and data analysis.

mohammed afaan voted
Ishan voted

#3777 - Buehler 2024
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking

Buehler, Markus J.

arXiv 2024;():

2024

Ref ID: 8716

PRefLexOR (Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning) combines preference optimization with concepts from Reinforcement Learning to enable models to self-teach through iterative reasoning improvements. We propose a recursive learning approach that engages the model in multi-step reasoning, revisiting, and refining intermediate steps before producing a final output in training and inference phases. Through multiple training stages, the model first learns to align its reasoning with accurate decision paths by optimizing the log odds between preferred and non-preferred responses. During this process, PRefLexOR builds a dynamic knowledge graph by generating questions from random text chunks and retrieval-augmentation to contextualize relevant details from the entire training corpus. In the second stage, preference optimization enhances model performance by using rejection sampling to fine-tune reasoning quality by continually producing in-situ training data while masking the reasoning steps. Recursive optimization within a thinking token framework introduces iterative feedback loops, where the model refines reasoning, achieving deeper coherence, consistency, and adaptability. Implemented in small language models with only 3 billion parameters, we show that even tiny models can iteratively teach themselves to reason with greater depth and reflectivity. Our implementation is straightforward and can be incorporated into any existing pretrained LLM. We focus our examples on applications in biological materials science and demonstrate the method in a variety of case studies that range from in-domain to cross-domain applications. Using reasoning strategies that include thinking and reflection modalities, we build a multi-agent recursive self-improving inference approach to successively improve responses via repeated sampling in inference time.

Mike voted
Srividya voted

#150 - Bui 2024
Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT

Bui, T.; Tran, O.; Nguyen, P.; Ho, B.; Nguyen, L.; Quan, T.; Assoc Computing, Machinery

1st Workshop on AI-powered Question Answering Systems for Multimedia (AIQAM) 2024;():36-41

Phuket, THAILAND Assoc Computing Machinery 2024

DOI: 10.1145/3643479.3662055 · Ref ID: 3013

In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly to them. Despite their powerful language capabilities, LLMs, like pre-trained language models (PLMs), still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, some researchers have proposed Retrieval-Augmented Generation (RAG) techniques, while others have proposed integrating LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate feedback to user queries. Education plays a crucial role in human development and progress. With the ongoing digital transformation, traditional education is being replaced by digital or blended education, and educational data in the digital environment is increasing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of KG in conjunction with LLMs for question-answering tasks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2534 - Buluç 2013
High-Productivity and High-Performance Analysis of Filtered Semantic Graphs

Buluç, A.; Duriakova, E.; Fox, A.; Gilbert, J. R.; Kamil, S.; Lugowski, A.; Oliker, L.; Williams, S.

2013 IEEE 27th International Symposium on Parallel and Distributed Processing 2013;():237-248

2013

DOI: 10.1109/IPDPS.2013.52 · Ref ID: 6337

High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry attributes of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must view the graph through a filter that passes only those individual vertices and edges of interest. Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, is customizable in two ways. First, the user can write custom graph algorithms by specifying operations between edges and vertices. These programmer-specified operations are called semiring operations due to KDT's underlying linear-algebraic abstractions. Second, the user can customize existing graph algorithms by writing filters that return true for those vertices and edges the user wants to retain during algorithm execution. For high productivity, both semiring operations and filters are written in a high-level language, resulting in relatively low performance due to the bottleneck of having to call into the Python virtual machine for each vertex and edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate semiring operations and filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance Combinatorial BLAS engine, and show our approach enables users to write in high-level languages and still obtain the high performance of low-level code. We also present a new roofline model for graph traversals, and show that our high-performance implementations do not significantly deviate from the roofline. Overall, we demonstrate the first known solution to the problem of obtaining high performance from a productivity language when applying graph algorithms selectively on semantic graphs.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2579 - Bunte 2016
Integrating semantics for diagnosis of manufacturing systems

Bunte, A.; Diedrich, A.; Niggemann, O.

2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA) 2016;():1-8

2016

DOI: 10.1109/ETFA.2016.7733721 · Ref ID: 6859

Trends in novel manufacturing systems lead to an increased level of data availability and smart usage of these data. Many approaches are now available to exploit the data, but the increased flexibility of the systems has made the interaction between machines and humans a challenge. Humans have to browse through a huge amount of data and need knowledge about the machine and the underlying algorithms to interpret the results; they cannot communicate in the terms they know, a problem we call the conceptual gap. The user should be enabled to communicate with the machine on a more abstract level and in a more natural way. Therefore, a natural language layer is introduced to provide users with a familiar interaction interface. Underlying layers contain knowledge about the domain, the machines, and how data can be accessed and processed. This enables users' questions such as “Are there any anomalies in the system?” to be answered. Answers are provided in natural language and evaluated with a test set of 204 questions.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1589 - Buongiorno 2024
Leveraging Gaming to Enhance Knowledge Graphs for Explainable Generative AI Applications

Buongiorno, S.; Clark, C.

IEEE Conference on Computational Intelligence and Games, CIG 2024;():

IEEE Computer Society 2024

DOI: 10.1109/CoG60054.2024.10645673 · Ref ID: 4402

External knowledge graphs (KGs) can be used to augment large language models (LLMs), while simultaneously providing an explainable knowledge base of facts that can be inspected by a human. This approach may be particularly valuable in domains where explainability is critical, like human trafficking data analysis. However, creating KGs can pose challenges. KGs parsed from documents may comprise explicit connections (those directly stated by a document) but miss implicit connections (those obvious to a human although not directly stated). To address these challenges, this preliminary research introduces the GAME-KG framework, standing for 'Gaming for Augmenting Metadata and Enhancing Knowledge Graphs.' GAME-KG is a federated approach to modifying explicit as well as implicit connections in KGs by using crowdsourced feedback collected through video games. GAME-KG is shown through two demonstrations: a Unity test scenario from Dark Shadows, a video game that collects feedback on KGs parsed from US Department of Justice (DOJ) Press Releases on human trafficking, and a follow-up experiment in which OpenAI's GPT-4 is prompted to answer questions based on a modified and an unmodified KG. Initial results suggest that GAME-KG can be an effective framework for enhancing KGs while simultaneously providing an explainable set of structured facts verified by humans. © 2024 IEEE.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#2504 - Buyko 2011
Generating Semantics for the Life Sciences via Text Analytics

Buyko, E.; Hahn, U.

2011 IEEE Fifth International Conference on Semantic Computing 2011;():193-196

2011

DOI: 10.1109/ICSC.2011.75 · Ref ID: 6227

The life sciences have a strong need for carefully curated, semantically rich fact repositories. Knowledge harvesting from unstructured textual sources is currently performed by highly skilled curators who manually feed semantics into such databases as a result of deep understanding of the documents chosen to populate such repositories. As this is a slow and costly process, we here advocate an automatic approach to the generation of database contents which is based on JREX, a high performance relation extraction system. As a real-life example, we target REGULONDB, the world's largest manually curated reference database for the transcriptional regulation network of E. coli. We investigate in our study the performance of automatic knowledge capture from various literature sources, such as PUBMED abstracts and associated full text articles. Our results show that we can, indeed, automatically re-create a considerable portion of the REGULONDB database by processing the relevant literature sources. Hence, this approach might help curators widen the knowledge acquisition bottleneck in this field.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1008 - Buzzega 2023
Automated Knowledge Graph Completion for Natural Language Understanding: Known Paths and Future Directions

Buzzega, G.; Guidetti, V.; Mandreoli, F.; Mariotti, L.; Belli, A.; Lombardi, P.

CEUR Workshop Proceedings 2023;3478():160-172

CEUR-WS 2023

Ref ID: 5248

Knowledge Graphs (KGs) are large collections of structured data that can model real-world knowledge and are important assets for the companies that employ them. KGs are usually constructed iteratively and often show a sparse structure. Also, as knowledge evolves, KGs must be updated and completed. Many automatic methods for KG Completion (KGC) have been proposed in the literature to reduce the costs associated with manual maintenance. Motivated by an industrial case study aiming to enrich a KG specifically designed for Natural Language Understanding tasks, this paper presents an overview of classical and modern deep learning completion methods. In particular, we delve into Large Language Models (LLMs), which are the most promising deep learning architectures. We show that their applications to KGC are affected by several shortcomings, namely that they neglect the structure of the KG and treat KGC as a classification problem. Such limitations, together with the brittleness of the LLMs themselves, stress the need to create KGC solutions at the interface between symbolic and neural approaches and point the way ahead for future research in intelligible corpus-based KGC. © 2023 CEUR-WS. All rights reserved.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#129 - Cadeddu 2024
A comparative analysis of knowledge injection strategies for large language models in the scholarly domain

Cadeddu, A.; Chessa, A.; De Leo, V.; Fenu, G.; Motta, E.; Osborne, F.; Recupero, D. R.; Salatino, A.; Secchi, L.

Eng. Appl. Artif. Intell. 2024;133():13

2024

DOI: 10.1016/j.engappai.2024.108166 · Ref ID: 3656

In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers.
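The simplest family of injection strategies surveyed in work like this can be illustrated by verbalizing KG triples about a paper's research topics and prepending them to the text before classification. A minimal sketch with hypothetical helper names; the paper's four methodologies are more involved than this:

```python
def verbalize(triples):
    """Turn (subject, predicate, object) triples into plain sentences."""
    return " ".join(f"{s} {p.replace('_', ' ')} {o}." for s, p, o in triples)

def inject(abstract: str, triples) -> str:
    # The enriched input a transformer classifier would receive:
    # verbalized knowledge, a separator token, then the original text.
    return verbalize(triples) + " [SEP] " + abstract

triples = [("knowledge graph", "is_a", "data structure")]
enriched = inject("We study KG embeddings.", triples)
```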

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1106 - Cadeddu 2024
A comparative analysis of knowledge injection strategies for large language models in the scholarly domain

Cadeddu, A.; Chessa, A.; De Leo, V.; Fenu, G.; Motta, E.; Osborne, F.; Reforgiato Recupero, D.; Salatino, A.; Secchi, L.

Eng Appl Artif Intell 2024;133():

2024

DOI: 10.1016/j.engappai.2024.108166 · Ref ID: 3953

In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers. © 2024

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#196 - Cai 2024
Editing Knowledge Representation of Language Model via Rephrased Prefix Prompts

Cai, Y. C.; Cao, D.; Guo, R. X.; Wen, Y. Q.; Liu, G. Q.; Chen, E. H.; Zhang, J. Y.

20th International Conference on Intelligent Computing (ICIC) 2024;14878():459-470

Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024

DOI: 10.1007/978-981-97-5672-8_39 · Ref ID: 3658

Neural language models (LMs) have been extensively trained on vast corpora to store factual knowledge about various aspects of the world described in texts. Current technologies typically employ knowledge editing methods or specific prompts to modify LM outputs. However, existing knowledge editing methods are costly and inefficient, struggling to produce appropriate text. Additionally, prompt engineering is opaque and requires significant effort to find suitable prompts. To address these issues, we introduce a new method called PSPEM (Prefix Soft-Prompt Editing Method), which requires only a single training run and can then be reused indefinitely. It resolves the inefficiency and generalizability issues of knowledge editing methods and overcomes the opacity of prompt engineering by automatically seeking optimal soft prompts. Specifically, PSPEM adopts a prompt encoder and an encoding converter to compress and refine key information in prompts and adopts prompt alignment techniques to guide model generation, ensuring text consistency and adherence to the intended structure and content. We have validated the effectiveness of PSPEM through knowledge editing and attribute insertion. On the COUNTERFACT dataset, PSPEM achieved nearly 100% editing accuracy and demonstrated the highest level of fluency. We further analyzed the similarities between PSPEM and original prompts and their impact on the model's internals. The results indicate that PSPEM can serve as an alternative to original prompts, supporting the model in effective editing.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#2037 - Calixto 2021
Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks

Calixto, I.; Raganato, A.; Pasini, T.

NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 2021;():3651-3661

Association for Computational Linguistics (ACL) 2021

Ref ID: 5700

Masked language models have quickly become the de facto standard when processing text. Recently, several approaches have been proposed to further enrich word representations with external knowledge sources such as knowledge graphs. However, these models are devised and evaluated in a monolingual setting only. In this work, we propose a language-independent entity prediction task as an intermediate training procedure to ground word representations on entity semantics and bridge the gap across different languages by means of a shared vocabulary of entities. We show that our approach effectively injects new lexical-semantic knowledge into neural models, improving their performance on different semantic tasks in the zero-shot crosslingual setting. As an additional advantage, our intermediate training does not require any supplementary input, allowing our models to be applied to new datasets right away. In our experiments, we use Wikipedia articles in up to 100 languages and already observe consistent gains compared to strong baselines when predicting entities using only the English Wikipedia. Adding extra languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on ever increasing amounts of Wikipedia languages. © 2021 Association for Computational Linguistics.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3062 - Calvagna 2023
Using Knowledge Awareness to Improve Safety of Autonomous Driving

Calvagna, A.; Ghosh, A.; Soudjnai, S.

2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2023;():2997-3002

2023

DOI: 10.1109/SMC53992.2023.10394593 · Ref ID: 6916

We present a method that incorporates knowledge awareness into the symbolic computation of discrete controllers for reactive cyber-physical systems, to improve decision making about the unknown operating environment under uncertain/incomplete inputs. Assuming an abstract model of the system and the environment, we translate the knowledge awareness of the operating context into linear temporal logic formulas and incorporate them into the system specifications to synthesize a controller. The knowledge base is built upon an ontology model of the environment objects and behavioural rules, which also includes symbolic models of partial input features. The resulting symbolic controller supports smoother, early reactions, which improves the security of the system over existing approaches based on incremental symbolic perception. A motion planning case study for an autonomous vehicle has been implemented to validate the approach, and the presented results show significant improvements with respect to safety over state-of-the-art symbolic controllers for reactive systems.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1821 - Cao 2024
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models

Cao, B.; Tang, Q.; Lin, H.; Jiang, S.; Dong, B.; Han, X.; Chen, J.; Wang, T.; Sun, L.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():14016-14036

European Language Resources Association (ELRA) 2024

Ref ID: 4655

Memory is one of the most essential cognitive functions, serving as a repository of world knowledge and episodes of activities. In recent years, large-scale pre-trained language models have shown remarkable memorizing ability. In contrast, vanilla neural networks without pre-training have long been observed to suffer from the catastrophic forgetting problem. To investigate this retentive-forgetful contradiction and understand the memorizing dynamic mechanism of language models, we conduct thorough experiments by controlling the target knowledge types, the learning strategies and the learning schedules. We find that: 1) Vanilla language models without pre-training are forgetful; 2) Pre-training leads to retentive language models; 3) Knowledge relevance and diversification significantly influence the memory formation. These conclusions are useful for understanding the abilities of pre-trained language models and shed light on designing and evaluating new learning and inference algorithms of language models. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#477 - Cao 2021
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Cao, B. X.; Lin, H. Y.; Han, X. P.; Sun, L.; Yan, L. Y.; Liao, M.; Xue, T.; Xu, J.; Assoc Computat, Linguist

Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():1860-1874

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 3621

Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying predicting mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that previous decent performance mainly owes to biased prompts which overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improves knowledge prediction mainly due to entity type guidance and golden answer leakage. Our findings shed light on the underlying predicting mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases.

Xinchen voted
Davis voted
Final decision
What was the agreed final decision?

#3212 - Cao 2024
AutoRD: An Automatic and End-to-End System for Rare Disease Knowledge Graph Construction Based on Ontologies-enhanced Large Language Models

Cao, Lang; Sun, Jimeng; Cross, Adam

arXiv 2024;():

2024

Ref ID: 8151

Rare diseases affect millions worldwide but often face limited research focus due to their low prevalence. This results in prolonged diagnoses and a lack of approved therapies. Recent advancements in Large Language Models (LLMs) have shown promise in automating the extraction of medical information, offering potential to improve medical diagnosis and management. However, most LLMs lack professional medical knowledge, especially concerning rare diseases, and struggle to handle the latest rare disease information. They also cannot effectively manage rare disease data and are not directly suitable for diagnosis and management tasks. Our objective is to create an end-to-end system called AutoRD, which automates the extraction of information from medical texts about rare diseases, focusing on entities and their relations. AutoRD integrates up-to-date structured knowledge and demonstrates superior performance in rare disease extraction tasks. We conduct various experiments to evaluate AutoRD's performance, aiming to surpass common LLMs and traditional methods.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#722 - Cao 2024
Research on Large Language Model for Coal Mine Equipment Maintenance Based on Multi-Source Text

Cao, X. G.; Xu, W. T.; Zhao, J. B.; Duan, Y.; Yang, X.

Appl. Sci.-Basel 2024;14(7):16

2024

DOI: 10.3390/app14072946 · Ref ID: 3189

The efficient management and utilization of coal mine equipment maintenance knowledge is an indispensable foundation for advancing the establishment of intelligent mines. This knowledge suffers from problems such as scattered storage, low sharing, and insufficient management, which restrict the development of coal mine intelligence. To address these problems, a large language model for the maintenance of coal mine equipment based on multi-source text (XCoalChat) was proposed to better manage and utilize the existing massive knowledge of coal mine equipment maintenance. A dataset of coal mine equipment maintenance based on ReliableCEMK-Self-Instruction was constructed to obtain a wide and diverse amount of knowledge through sample generation. To address the hallucination problem of large language models, a knowledge graph enhancement method based on the "Coal Mine Equipment Maintenance System-Full Life Cycle-Specification" was proposed to improve the knowledge density. A triple-LoRA fine-tuning mechanism and the DPO direct preference optimization method were introduced on top of the baseline model, which guarantees that XCoalChat can handle multiple Q&A and maintenance decision analysis tasks with limited computing power. A comprehensive assessment of XCoalChat against ChatGLM, Bloom, and LLama was performed through experiments covering coal mine dialog consulting, coal mine professional consulting, and maintenance decision analysis. The results showed that XCoalChat achieved the best response accuracy in professional consulting and maintenance decision analysis, and also took the least reasoning time on average. XCoalChat outperformed other mainstream large language models, which verifies that XCoalChat is an effective large language model in the field of coal mine equipment maintenance.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#840 - Carta 2024
Towards Zero-shot Knowledge Graph building: Automated Schema Inference

Carta, S.; Giuliani, A.; Manca, M. M.; Piano, L.; Tiddia, S. G.; Acm

32nd ACM Conference on User Modeling, Adaptation and Personalization (ACM UMAP) 2024;():467-473

Cagliari, ITALY Assoc Computing Machinery 2024

DOI: 10.1145/3631700.3665234 · Ref ID: 3324

In the current Digital Transformation scenario, Knowledge Graphs are essential for comprehending, representing, and exploiting complex information in a structured form. The main paradigm for automatically generating proper Knowledge Graphs relies on predefined schemas or ontologies. Such schemas are typically manually constructed, requiring intensive human effort, and are often sensitive to information loss due to negligence, incomplete analysis, or human subjectivity or inclination. Limiting human bias and the resulting information loss in creating proper Knowledge Graphs is paramount, particularly for user modeling in various sectors, such as education or healthcare. To this end, we propose a novel approach to automatically generating a proper entity schema. The devised methodology combines the language understanding capabilities of LLMs with classical machine learning methods such as clustering to properly build an entity schema from a set of documents. This solution eliminates the need for human intervention and fosters a more efficient and comprehensive knowledge representation. The assessment of our proposal concerns adopting a state-of-the-art entity extraction model (UniNER) to estimate the relevance of the extracted entities based on the generated schema. Results confirm the potential of our approach, as we observed a negligible difference between the topic similarity score obtained with the ground truth and with the automatically generated schema (less than 1% on average on three different datasets). Such an outcome confirms that the proposed approach may be valuable in automatically creating an entity schema from a set of documents.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3554 - Carta 2023
Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction

Carta, Salvatore; Giuliani, Alessandro; Piano, Leonardo; Podda, Alessandro Sebastian; Pompianu, Livio; Tiddia, Sandro Gabriele

arXiv 2023;():

2023

Ref ID: 7771

In the current digitalization era, capturing and effectively representing knowledge is crucial in most real-world scenarios. In this context, knowledge graphs represent a potent tool for retrieving and organizing a vast amount of information in a properly interconnected and interpretable structure. However, their generation is still challenging and often requires considerable human effort and domain expertise, hampering scalability and flexibility across different application fields. This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models, such as GPT-3.5, to address all the main critical issues in knowledge graph building. The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies in the main stages of the generation process. Our approach may bring significant benefits to the scientific community. In particular, the main contribution can be summarized by: (i) an innovative strategy for iteratively prompting large language models to extract relevant components of the final graph; (ii) a zero-shot strategy for each prompt, meaning that there is no need to provide examples for "guiding" the prompt result; (iii) a scalable solution, as the adoption of LLMs avoids the need for any external resources or human expertise. To assess the effectiveness of our proposed model, we performed experiments on a dataset that covered a specific domain. We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#798 - Castell-Díaz 2023
Supporting SNOMED CT postcoordination with knowledge graph embeddings

Castell-Díaz, J.; Miñarro-Giménez, J. A.; Martínez-Costa, C.

J. Biomed. Inform. 2023;139():10

2023

DOI: 10.1016/j.jbi.2023.104297 · Ref ID: 3140

SNOMED CT postcoordination is an underused mechanism that can help to implement advanced systems for the automatic extraction and encoding of clinical information from text. It allows defining non-existing SNOMED CT concepts by their relationships with existing ones. Manually building postcoordinated expressions is a difficult task. It requires a deep knowledge of the terminology and the support of specialized tools that barely exist. To support the building of postcoordinated expressions, we have implemented KGE4SCT: a method that suggests the corresponding SNOMED CT postcoordinated expression for a given clinical term. We leverage the SNOMED CT ontology and its graph-like structure and use knowledge graph embeddings (KGEs). The objective of such embeddings is to represent knowledge graph components (e.g. entities and relations) in a vector space in a way that captures the structure of the graph. Then, we use vector similarity and analogies to obtain the postcoordinated expression of a given clinical term. We obtained a semantic type accuracy of 98%, a relationship accuracy of 90%, and an analogy accuracy of 60%, with an overall postcoordination completeness of 52% for the Spanish SNOMED CT version. We have also applied it to the English SNOMED CT version and outperformed state-of-the-art methods in both corpus generation for language model training for this task (an improvement of 6% in analogy accuracy) and automatic postcoordination of SNOMED CT expressions, with an increase of 17% in partial conversion rate.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#792 - Caufield 2024
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning

Caufield, J. H.; Hegde, H.; Emonet, V.; Harris, N. L.; Joachimiak, M. P.; Matentzoglu, N.; Kim, H.; Moxon, S.; Reese, J. T.; Haendel, M. A.; Robinson, P. N.; Mungall, C. J.

Bioinformatics 2024;40(3):10

2024

DOI: 10.1093/bioinformatics/btae104 · Ref ID: 3780

Motivation: Creating knowledge bases and ontologies is a time-consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas.
Results: Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM.
Availability and implementation: SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.
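
The recursive schema-walking idea can be sketched as follows. This is a minimal sketch in the spirit of SPIRES, not the OntoGPT implementation: `ask_llm` is a hypothetical stub keyed on the field name, and the schema/answers are invented for illustration. Nested schema objects trigger recursion; leaf fields become individual zero-shot questions.

```python
# Minimal sketch of recursive, schema-driven prompt interrogation (assumed
# design). Arbitrarily nested schemas are handled by recursing on dict values.

def ask_llm(question: str, text: str) -> str:
    # Hypothetical stand-in for an LLM call; answers are keyed on field names.
    answers = {"name": "carbonara", "ingredient": "egg", "step": "boil pasta"}
    for field, value in answers.items():
        if field in question:
            return value
    return ""

def extract(schema: dict, text: str) -> dict:
    result = {}
    for field, subschema in schema.items():
        if isinstance(subschema, dict):
            # Nested object: recurse, so deep schemas are populated too.
            result[field] = extract(subschema, text)
        else:
            # Leaf slot: interrogate the model with one focused question.
            result[field] = ask_llm(f"What is the {field}?", text)
    return result

recipe_schema = {"name": "str",
                 "ingredients": {"ingredient": "str"},
                 "steps": {"step": "str"}}
```

The real system additionally grounds each extracted value against ontologies to obtain identifiers, which this sketch omits.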

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#172 - Celik 2023
Developmental Scaffolding with Large Language Models

Celik, B.; Ahmetoglu, A.; Ugur, E.; Oztop, E.; Ieee

IEEE International Conference on Development and Learning (ICDL) 2023;():396-402

Macau, PEOPLES R CHINA Ieee 2023

DOI: 10.1109/icdl55364.2023.10364374 · Ref ID: 3634

Exploration and self-observation are key mechanisms of infant sensorimotor development. These processes are further guided by parental scaffolding to accelerate skill and knowledge acquisition. In developmental robotics, this approach has often been adopted by having a human act as the source of scaffolding. In this study, we investigate whether Large Language Models (LLMs) can act as a scaffolding agent for a robotic system that aims to learn to predict the effects of its actions. To this end, an object manipulation setup is considered where one object can be picked and placed on top of or in the vicinity of another object. The adopted LLM is asked to guide the action selection process through algorithmically generated state descriptions and action selection alternatives in natural language. The simulation experiments that include cubes in this setup show that LLM-guided (GPT3.5-guided) learning yields significantly faster discovery of novel structures compared to random exploration. However, we observed that GPT3.5 fails to effectively guide the robot in generating structures with objects of different affordances, such as cubes and spheres. Overall, we conclude that even without fine-tuning, LLMs may serve as a moderate scaffolding agent for improving robot learning; however, they still lack affordance understanding, which limits the applicability of current LLMs in robotic scaffolding tasks.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#289 - Cenikj 2023
From language models to large-scale food and biomedical knowledge graphs

Cenikj, G.; Strojnik, L.; Angelski, R.; Ogrinc, N.; Seljak, B. K.; Eftimov, T.

Sci Rep 2023;13(1):14

2023

DOI: 10.1038/s41598-023-34981-4 · Ref ID: 3422

Knowledge about the interactions between dietary and biomedical factors is scattered throughout uncountable research articles in an unstructured form (e.g., text, images, etc.) and requires automatic structuring so that it can be provided to medical professionals in a suitable format. Various biomedical knowledge graphs exist, however, they require further extension with relations between food and biomedical entities. In this study, we evaluate the performance of three state-of-the-art relation-mining pipelines (FooDis, FoodChem and ChemDis) which extract relations between food, chemical and disease entities from textual data. We perform two case studies, where relations were automatically extracted by the pipelines and validated by domain experts. The results show that the pipelines can extract relations with an average precision around 70%, making new discoveries available to domain experts with reduced human effort, since the domain experts should only evaluate the results, instead of finding, and reading all new scientific papers.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3151 - Chai 2024
RAVL: A Retrieval-Augmented Visual Language Model Framework for Knowledge-Based Visual Question Answering

Chai, Naiquan; Zou, Dongsheng; Liu, Jiyuan; Wang, Hao; Yang, Yuming; Song, Xinyi

Natural Language Processing and Chinese Computing: 13th National CCF Conference, NLPCC 2024, Hangzhou, China, November 1–3, 2024, Proceedings, Part III 2024;():394–406

Hangzhou, China Springer-Verlag 2024

DOI: 10.1007/978-981-97-9437-9_31 · Ref ID: 7142

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#740 - Chan 2021
SALKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning

Chan, A.; Xu, J. S.; Long, B. Y.; Sanyal, S.; Gupta, T.; Ren, X.

35th Annual Conference on Neural Information Processing Systems (NeurIPS) 2021;34():

Electr Network Neural Information Processing Systems (Nips) 2021

Ref ID: 3300

Augmenting pre-trained language models with knowledge graphs (KGs) has achieved success on various commonsense reasoning tasks. However, for a given task instance, the KG, or certain parts of the KG, may not be useful. Although KG-augmented models often use attention to focus on specific KG components, the KG is still always used, and the attention mechanism is never explicitly taught which KG components should be used. Meanwhile, saliency methods can measure how much a KG feature (e.g., graph, node, path) influences the model to make the correct prediction, thus explaining which KG features are useful. This paper explores how saliency explanations can be used to improve KG-augmented models' performance. First, we propose to create coarse (Is the KG useful?) and fine (Which nodes/paths in the KG are useful?) saliency explanations. Second, to motivate saliency-based supervision, we analyze oracle KG-augmented models which directly use saliency explanations as extra inputs for guiding their attention. Third, we propose SALKG, a framework for KG-augmented models to learn from coarse and/or fine saliency explanations. Given saliency explanations created from a task's training set, SALKG jointly trains the model to predict the explanations, then solve the task by attending to KG features highlighted by the predicted explanations. On three commonsense QA benchmarks (CSQA, OBQA, CODAH) and a range of KG-augmented models, we show that SALKG can yield considerable performance gains - up to 2.76% absolute improvement on CSQA.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3324 - Chan 2022
DeepTrust: A Reliable Financial Knowledge Retrieval Framework For Explaining Extreme Pricing Anomalies

Chan, Pok Wah

arXiv 2022;():

2022

Ref ID: 7527

Extreme pricing anomalies may occur unexpectedly without a trivial cause, and equity traders typically undergo a meticulous process of sourcing disparate information and analyzing its reliability before integrating it into the trusted knowledge base. We introduce DeepTrust, a reliable financial knowledge retrieval framework on Twitter to explain extreme price moves at speed, while ensuring data veracity using state-of-the-art NLP techniques. Our proposed framework consists of three modules, specialized for anomaly detection, information retrieval and reliability assessment. The workflow starts with identifying anomalous asset price changes using machine learning models trained with historical pricing data, and retrieving correlated unstructured data from Twitter using enhanced queries with dynamic search conditions. DeepTrust extrapolates information reliability from tweet features, traces of generative language models, argumentation structure, subjectivity and sentiment signals, and refines a concise collection of credible tweets for market insights. The framework is evaluated on two self-annotated financial anomalies, i.e., Twitter and Facebook stock price on 29 and 30 April 2021. The optimal setup outperforms the baseline classifier by 7.75% and 15.77% on F0.5-scores, and 10.55% and 18.88% on precision, respectively, proving its capability in screening unreliable information precisely. At the same time, the information retrieval and reliability assessment modules are analyzed individually for their effectiveness and causes of limitations, with identified subjective and objective factors that influence performance. As a collaborative project with Refinitiv, this framework paves a promising path towards building a scalable commercial solution that assists traders in reaching investment decisions on pricing anomalies with authenticated knowledge from social media platforms in real time.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#361 - Chang 2021
Incorporating Domain Knowledge Into Language Models by Using Graph Convolutional Networks for Assessing Semantic Textual Similarity: Model Development and Performance Comparison

Chang, D.; Lin, E.; Brandt, C.; Taylor, R. A.

JMIR Med. Inf. 2021;9(11):10

2021

DOI: 10.2196/23101 · Ref ID: 3495

Background: Although electronic health record systems have facilitated clinical documentation in health care, they have also introduced new challenges, such as the proliferation of redundant information through the use of copy and paste commands or templates. One approach to trimming down bloated clinical documentation and improving clinical summarization is to identify highly similar text snippets with the goal of removing such text. Objective: We developed a natural language processing system for the task of assessing clinical semantic textual similarity. The system assigns scores to pairs of clinical text snippets based on their clinical semantic similarity. Methods: We leveraged recent advances in natural language processing and graph representation learning to create a model that combines linguistic and domain knowledge information from the MedSTS data set to assess clinical semantic textual similarity. We used bidirectional encoder representation from transformers (BERT)-based models as text encoders for the sentence pairs in the data set and graph convolutional networks (GCNs) as graph encoders for corresponding concept graphs that were constructed based on the sentences. We also explored techniques, including data augmentation, ensembling, and knowledge distillation, to improve the model's performance, as measured by the Pearson correlation coefficient (r). Results: Fine-tuning the BERT_base and ClinicalBERT models on the MedSTS data set provided a strong baseline (Pearson correlation coefficients: 0.842 and 0.848, respectively) compared to those of the previous year's submissions. Our data augmentation techniques yielded moderate gains in performance, and adding a GCN-based graph encoder to incorporate the concept graphs also boosted performance, especially when the node features were initialized with pretrained knowledge graph embeddings of the concepts (r=0.868). 
As expected, ensembling improved performance, and performing multisource ensembling by using different language model variants, conducting knowledge distillation with the multisource ensemble model, and taking a final ensemble of the distilled models further improved the system's performance (Pearson correlation coefficients: 0.875, 0.878, and 0.882, respectively). Conclusions: This study presents a system for the MedSTS clinical semantic textual similarity benchmark task, which was created by combining BERT-based text encoders and GCN-based graph encoders in order to incorporate domain knowledge into the natural language processing pipeline. We also experimented with other techniques involving data augmentation, pretrained concept embeddings, ensembling, and knowledge distillation to further increase our system's performance. Although the task and its benchmark data set are in the early stages of development, this study, as well as the results of the competition, demonstrates the potential of modern language model-based systems to detect redundant information in clinical notes.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3319 - Chaudhary 2024
Decoding Intelligence: A Framework for Certifying Knowledge Comprehension in LLMs

Chaudhary, Isha; Jain, Vedaant V.; Singh, Gagandeep

arXiv 2024;():

2024

Ref ID: 8138

Knowledge comprehension capability is an important aspect of human intelligence. As Large Language Models (LLMs) are being envisioned as superhuman agents, it is crucial for them to be proficient at knowledge comprehension. However, existing benchmarking studies do not provide consistent, generalizable, and formal guarantees on the knowledge comprehension capabilities of LLMs. In this work, we propose the first framework to certify knowledge comprehension in LLMs with formal probabilistic guarantees. Our certificates are quantitative – they consist of high-confidence, tight bounds on the probability that a target LLM gives the correct answer on any knowledge comprehension prompt sampled from a distribution. We design and certify novel specifications that precisely represent distributions of knowledge comprehension prompts leveraging knowledge graphs. We certify SOTA LLMs for specifications over the Wikidata5m knowledge graph. We find that the knowledge comprehension capability improves significantly with scaling the size of the models.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1375 - Che 2024
A Hierarchical Context Augmentation Method to Improve Retrieval-Augmented LLMs on Scientific Papers

Che, T. Y.; Mao, X. L.; Lan, T.; Huang, H.

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():243-254

Association for Computing Machinery 2024

DOI: 10.1145/3637528.3671847 · Ref ID: 3957

Scientific papers of a large scale on the Internet encompass a wealth of data and knowledge, attracting the attention of numerous researchers. To fully utilize this knowledge, Retrieval-Augmented Large Language Models (LLMs) usually leverage a large-scale scientific corpus for training and then retrieve relevant passages from external memory to improve generation, which has demonstrated outstanding performance. However, existing methods can only capture one-dimensional fragmented textual information without incorporating hierarchical structural knowledge, e.g., the deductive relationship between abstract and main body, which makes it difficult to grasp the central thought of papers. To tackle this problem, we propose a hierarchical context augmentation method, which helps Retrieval-Augmented LLMs to autoregressively learn the structural knowledge of scientific papers. Specifically, we utilize the document tree to represent the hierarchical relationships of a paper and enhance the structure information of scientific context from three aspects: scale, format and global information. First, we treat each top-bottom path of the document tree as a logically independent context, which can be used to largely increase the scale of the extracted structural corpus. Second, we propose a novel label-based format to represent the structure of context in textual sequences, unified between training and inference. Third, we introduce the global information of retrieved passages to further enhance the structure of context. Extensive experiments on three scientific tasks show that the proposed method significantly improves the performance of Retrieval-Augmented LLMs on all tasks. Besides, our method achieves state-of-the-art performance on the Question Answering task and outperforms ChatGPT. Moreover, it also brings considerable gains with irrelevant retrieval passages, illustrating its effectiveness in practical application scenarios. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
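
The "one context per top-bottom path" idea can be made concrete with a small sketch. The tree layout and section texts below are invented for illustration; the paper's actual label-based format is not reproduced. Each root-to-leaf path of the document tree becomes one independent training context.

```python
# Sketch: enumerate root-to-leaf paths of a document tree, each path serving
# as a logically independent context (assumed simplification of the method).

def paths(node, prefix=None):
    prefix = (prefix or []) + [node["text"]]
    if not node.get("children"):
        return [prefix]          # leaf: one complete top-bottom context
    out = []
    for child in node["children"]:
        out.extend(paths(child, prefix))
    return out

paper = {
    "text": "Title: Retrieval for Science",
    "children": [
        {"text": "Abstract: We study retrieval.", "children": []},
        {"text": "Method",
         "children": [{"text": "We build a document tree.", "children": []}]},
    ],
}
```

A tree with many leaves yields one context per leaf, which is how the corpus scale is multiplied relative to using the flat text once.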

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2201 - Chellagurki 2024
Biomedical Relation Extraction Using LLMs and Knowledge Graphs

Chellagurki, P.; Kumaru, S. P. Kumar; Peela, R. R.; Yeluri, N.; Rojas, C.; Jetcheva, J.

2024 IEEE 10th International Conference on Big Data Computing Service and Machine Learning Applications (BigDataService) 2024;():60-69

2024

DOI: 10.1109/BigDataService62917.2024.00015 · Ref ID: 7068

Due to the rapid growth of research papers on biomedical topics, it has become increasingly important to make advancements in biomedical Natural Language Processing (NLP). Biomedical NLP enables us to extract important information from text, such as new insights into the role of different genes in disease susceptibility, or the potential for drug therapies that are effective against one disease to work effectively against another. In this paper, we present a comparative evaluation of the binary relation classification capabilities of the current state-of-the-art binary relation classifier, BioBERT, against recently released open-source large language models, Gemma-2b, Gemma-7b, and Llama2-7b, which we fine-tune with the benchmark GAD and EU-ADR datasets. In addition, we quantify the potential of discovering new relationships by utilizing knowledge graphs built out of known binary relations.

Xinchen voted
Mike voted
Final decision
What was the agreed final decision?

#1417 - Chen 2024
INSIDE: LLMS' INTERNAL STATES RETAIN THE POWER OF HALLUCINATION DETECTION

Chen, C.; Liu, K.; Chen, Z.; Gu, Y.; Wu, Y.; Tao, M.; Fu, Z.; Ye, J.

12th International Conference on Learning Representations, ICLR 2024 2024;():

International Conference on Learning Representations, ICLR 2024

Ref ID: 4689

Knowledge hallucinations have raised widespread concerns about the security and reliability of deployed LLMs. Previous efforts to detect hallucinations have employed logit-level uncertainty estimation or language-level self-consistency evaluation, where semantic information is inevitably lost during the token-decoding procedure. Thus, we propose to explore the dense semantic information retained within LLMs' INternal States for hallucInation DEtection (INSIDE). In particular, a simple yet effective EigenScore metric is proposed to better evaluate responses' self-consistency, which exploits the eigenvalues of responses' covariance matrix to measure the semantic consistency/diversity in the dense embedding space. Furthermore, from the perspective of self-consistent hallucination detection, a test-time feature clipping approach is explored to truncate extreme activations in the internal states, which reduces overconfident generations and potentially benefits the detection of overconfident hallucinations. Extensive experiments and ablation studies are performed on several popular LLMs and question-answering (QA) benchmarks, showing the effectiveness of our proposal. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
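
The eigenvalue idea can be illustrated with a simplified, toy-scale sketch, not the paper's exact metric: embed K sampled responses, form the covariance matrix of the embeddings, and use its regularized log-determinant (the product of the eigenvalues) as a diversity score. The 2-D "embeddings" below are invented; real internal states are high-dimensional.

```python
# Toy EigenScore-style measure (simplified assumption): for a 2x2 covariance
# the determinant equals the product of the eigenvalues, so the log-det
# summarizes the semantic spread of the sampled responses.
import math

def eigenscore(embeddings, alpha=1e-3):
    k = len(embeddings)
    mean = [sum(e[i] for e in embeddings) / k for i in range(2)]
    centered = [[e[0] - mean[0], e[1] - mean[1]] for e in embeddings]
    # Covariance entries C[i][j] = (1/k) * sum_n centered[n][i]*centered[n][j]
    c = [[sum(v[i] * v[j] for v in centered) / k for j in range(2)]
         for i in range(2)]
    det = (c[0][0] + alpha) * (c[1][1] + alpha) - c[0][1] * c[1][0]
    return math.log(det)  # higher => more diverse, less self-consistent answers

consistent = [[1.0, 0.0], [1.01, 0.01], [0.99, -0.01]]  # near-identical answers
divergent  = [[1.0, 0.0], [-1.0, 1.0], [0.2, -1.2]]     # hallucination-like spread
```

Self-consistent answers cluster tightly, so their covariance eigenvalues (and hence the score) stay small; divergent answers spread out and push the score up.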

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1846 - Chen 2024
SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graphs

Chen, H.; Shen, X.; Lv, Q.; Wang, J.; Ni, X.; Ye, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4345-4360

Association for Computational Linguistics (ACL) 2024

Ref ID: 4371

Knowledge graphs (KGs) play a pivotal role in knowledge-intensive tasks across specialized domains, where the acquisition of precise and dependable knowledge is crucial. However, existing KG construction methods heavily rely on human intervention to attain qualified KGs, which severely hinders the practical applicability in real-world scenarios. To address this challenge, we propose a general KG construction framework, named SAC-KG, to exploit large language models (LLMs) as Skilled Automatic Constructors for domain Knowledge Graph. SAC-KG effectively involves LLMs as domain experts to generate specialized and precise multi-level KGs. Specifically, SAC-KG consists of three components: Generator, Verifier, and Pruner. For a given entity, Generator produces its relations and tails from raw domain corpora, to construct a specialized single-level KG. Verifier and Pruner then work together to ensure precision by correcting generation errors and determining whether newly produced tails require further iteration for the next-level KG. Experiments demonstrate that SAC-KG automatically constructs a domain KG at the scale of over one million nodes and achieves a precision of 89.32%, leading to a superior performance with over 20% increase in precision rate compared to existing state-of-the-art methods for the KG construction task. © 2024 Association for Computational Linguistics.
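
The Generator → Verifier → Pruner loop described above can be sketched with stub components. In the real system each stage is driven by an LLM over domain corpora; here the three stages are hypothetical hard-coded stubs (the rice-disease facts are illustrative only), so only the control flow is shown.

```python
# Sketch of the SAC-KG-style multi-level construction loop (assumed design).

def generator(entity):
    # Stub Generator: propose (relation, tail) candidates for an entity.
    proposals = {
        "rice blast": [("caused_by", "Magnaporthe oryzae"), ("caused_by", "cold")],
        "Magnaporthe oryzae": [("is_a", "fungus")],
    }
    return proposals.get(entity, [])

def verifier(head, relation, tail):
    # Stub Verifier: reject triples judged factually wrong.
    return (head, relation, tail) != ("rice blast", "caused_by", "cold")

def pruner(tail):
    # Stub Pruner: decide whether a tail deserves expansion at the next level.
    return tail == "Magnaporthe oryzae"

def sac_kg(root, max_levels=2):
    kg, frontier = [], [root]
    for _ in range(max_levels):
        next_frontier = []
        for head in frontier:
            for relation, tail in generator(head):
                if verifier(head, relation, tail):   # correct generation errors
                    kg.append((head, relation, tail))
                    if pruner(tail):                 # iterate to the next level?
                        next_frontier.append(tail)
        frontier = next_frontier
    return kg
```

The Verifier filters out wrong generations before they enter the graph, and the Pruner bounds growth by expanding only promising tails, which is what lets the loop scale to deep, multi-level KGs.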

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3623 - Chen 2023
Large Knowledge Model: Perspectives and Challenges

Chen, Huajun

arXiv 2023;():

2023

Ref ID: 7970

Humankind's understanding of the world is fundamentally linked to our perception and cognition, with human languages serving as one of the major carriers of world knowledge. In this vein, Large Language Models (LLMs) like ChatGPT epitomize the pre-training of extensive, sequence-based world knowledge into neural networks, facilitating the processing and manipulation of this knowledge in a parametric space. This article explores large models through the lens of "knowledge". We initially investigate the role of symbolic knowledge such as Knowledge Graphs (KGs) in enhancing LLMs, covering aspects like knowledge-augmented language models, structure-inducing pre-training, knowledgeable prompts, structured CoT, knowledge editing, semantic tools for LLMs and knowledgeable AI agents. Subsequently, we examine how LLMs can boost traditional symbolic knowledge bases, encompassing aspects like using the LLM as KG builder and controller, structured knowledge pretraining, and LLM-enhanced symbolic reasoning. Considering the intricate nature of human knowledge, we advocate for the creation of Large Knowledge Models (LKM), specifically engineered to manage a diversified spectrum of knowledge structures. This promising undertaking would entail several key challenges, such as disentangling the knowledge base from language models, cognitive alignment with human knowledge, integration of perception and cognition, and building large commonsense models for interacting with the physical world, among others. We finally propose a five-"A" principle to distinguish the concept of LKM.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#141 - Chen 2023
Contextual semantic embeddings for ontology subsumption prediction

Chen, J. Y.; He, Y.; Geng, Y. X.; Jiménez-Ruiz, E.; Dong, H.; Horrocks, I.

World Wide Web 2023;26(5):2569-2591

2023

DOI: 10.1007/s11280-023-01169-9 · Ref ID: 3457

Automating ontology construction and curation is an important but challenging task in knowledge engineering and artificial intelligence. Prediction by machine learning techniques such as contextual semantic embedding is a promising direction, but the relevant research is still preliminary especially for expressive ontologies in Web Ontology Language (OWL). In this paper, we present a new subsumption prediction method named BERTSubs for classes of OWL ontology. It exploits the pre-trained language model BERT to compute contextual embeddings of a class, where customized templates are proposed to incorporate the class context (e.g., neighbouring classes) and the logical existential restriction. BERTSubs is able to predict multiple kinds of subsumers including named classes from the same ontology or another ontology, and existential restrictions from the same ontology. Extensive evaluation on five real-world ontologies for three different subsumption tasks has shown the effectiveness of the templates and that BERTSubs can dramatically outperform the baselines that use (literal-aware) knowledge graph embeddings, non-contextual word embeddings and the state-of-the-art OWL ontology embeddings.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1609 - Chen 2024
LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery

Chen, K.; Du, Y.; You, T.; Islam, M.; Guo, Z.; Jin, Y.; Chen, G.; Heng, P. A.

Proceedings - IEEE International Conference on Robotics and Automation 2024;():10772-10778

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICRA57147.2024.10610603 · Ref ID: 4475

Visual question answering (VQA) can be fundamentally crucial for promoting robotic-assisted surgical education. In practice, the needs of trainees are constantly evolving, such as learning more surgical types and adapting to new surgical instruments/techniques. Therefore, continually updating the VQA system by a sequential data stream from multiple resources is demanded in robotic surgery to address new tasks. In surgical scenarios, the privacy issue of patient data often restricts the availability of old data when updating the model, necessitating an exemplar-free continual learning (CL) setup. However, prior studies overlooked two vital problems of the surgical domain: i) large domain shifts from diverse surgical operations collected from multiple departments or clinical centers, and ii) severe data imbalance arising from the uneven presence of surgical instruments or activities during surgical procedures. This paper proposes to address these two problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology. We first develop a new multi-teacher CL framework that leverages a multimodal LLM as the additional teacher. The strong generalization ability of the LLM can bridge the knowledge gap when domain shifts and data imbalances occur. We then put forth a novel data processing method that transforms complex LLM embeddings into logits compatible with our CL framework. We also design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of the old CL model. Finally, we construct a new dataset for surgical VQA tasks. Extensive experimental results demonstrate the superiority of our method to other advanced CL models. © 2024 IEEE.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#371 - Chen 2024
Information Extraction of Aviation Accident Causation Knowledge Graph: An LLM-Based Approach

Chen, L.; Xu, J. H.; Wu, T. Y.; Liu, J.

Electronics 2024;13(19):21

2024

DOI: 10.3390/electronics13193936 · Ref ID: 2947

Summarizing the causation of aviation accidents is conducive to enhancing aviation safety. The knowledge graph of aviation accident causation, constructed based on aviation accident reports, can assist in analyzing the causes of aviation accidents. With the continuous development of artificial intelligence technology, leveraging large language models for information extraction and knowledge graph construction has demonstrated significant advantages. This paper proposes an information extraction method for aviation accident causation based on Claude-prompt, which relies on the large-scale pre-trained language model Claude 3.5. Through prompt engineering, combined with a few-shot learning strategy and a self-judgment mechanism, this method achieves automatic extraction of accident-cause entities and their relationships. Experimental results indicate that this approach effectively improves the accuracy of information extraction, overcoming the limitations of traditional methods in terms of accuracy and efficiency in processing complex texts. It provides strong support for subsequently constructing a structured knowledge graph of aviation accident causation and conducting causation analysis of aviation accidents.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1409 - Chen 2021
Incorporating Domain Knowledge into Language Transformers for Multi-Label Classification of Chinese Medical Questions

Chen, P. H.; Zeng, Y. X.; Lee, L. H.

ROCLING 2021 - Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing 2021;():265-270

The Association for Computational Linguistics and Chinese Language Processing (ACLCLP) 2021

Ref ID: 5658

In this paper, we propose a knowledge infusion mechanism to incorporate domain knowledge into language transformers. Weakly supervised data is regarded as the main source for knowledge acquisition. We pre-train the language models to capture masked knowledge of focuses and aspects and then fine-tune them to obtain better performance on the downstream tasks. Due to the lack of publicly available datasets for multi-label classification of Chinese medical questions, we crawled questions from medical question/answer forums and manually annotated them using eight predefined classes: persons and organizations, symptom, cause, examination, disease, information, ingredient, and treatment. This yielded a total of 1,814 questions with 2,340 labels. Each question contains an average of 1.29 labels. We used Baidu Medical Encyclopedia as the knowledge resource. Two transformers, BERT and RoBERTa, were implemented to compare performance on our constructed datasets. Experimental results showed that our proposed model with the knowledge infusion mechanism achieves better performance, regardless of which evaluation metric (Macro F1, Micro F1, Weighted F1 or Subset Accuracy) is considered. © 2021 ROCLING 2021 - Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing. All rights reserved.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3184 - Chen 2024
Apollonion: Profile-centric Dialog Agent

Chen, Shangyu; Zhao, Zibo; Zhao, Yuanyuan; Li, Xiang

arXiv 2024;():

2024

Ref ID: 8230

The emergence of Large Language Models (LLMs) has transformed the development of dialog agents. Specifically, a well-trained LLM, acting as a central processing unit, is capable of providing fluent and reasonable responses to a user's requests. Besides, auxiliary tools such as external knowledge retrieval, personalized characters for vivid responses, and short/long-term memory for ultra-long context management have been developed, completing the usage experience for LLM-based dialog agents. However, the above-mentioned techniques do not solve the issue of personalization from the user's perspective: agents respond in the same fashion to different users, without consideration of their features, such as habits, interests and past experience. In other words, current implementations of dialog agents fail at "knowing the user". The capacity for well describing and representing the user is under development. In this work, we propose a framework for dialog agents to incorporate user profiling (initialization, update): the user's queries and responses are analyzed and organized into a structured user profile, which later serves to provide personal and more precise responses. Besides, we propose a series of evaluation protocols for personalization: to what extent a response is personal to different users. The framework is named Apollonion, inspired by the inscription of "Know Yourself" in the temple of Apollo (also known as Apollonion) in Ancient Greek. Few works have been conducted on incorporating personalization into LLMs; Apollonion is a pioneering work on guiding LLMs' responses to meet individuation via the application of dialog agents, with a set of evaluation methods for measuring personalization.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3392 - Chen 2024
Entity Alignment with Noisy Annotations from Large Language Models

Chen, Shengyuan; Zhang, Qinggang; Dong, Junnan; Hua, Wen; Li, Qing; Huang, Xiao

arXiv 2024;():

2024

Ref ID: 8319

Entity alignment (EA) aims to merge two knowledge graphs (KGs) by identifying equivalent entity pairs. While existing methods heavily rely on human-generated labels, it is prohibitively expensive to incorporate cross-domain experts for annotation in real-world scenarios. The advent of Large Language Models (LLMs) presents new avenues for automating EA with annotations, inspired by their comprehensive capability to process semantic information. However, it is nontrivial to directly apply LLMs for EA since the annotation space in real-world KGs is large. LLMs could also generate noisy labels that may mislead the alignment. To this end, we propose a unified framework, LLM4EA, to effectively leverage LLMs for EA. Specifically, we design a novel active learning policy to significantly reduce the annotation space by prioritizing the most valuable entities based on the entire inter-KG and intra-KG structure. Moreover, we introduce an unsupervised label refiner to continuously enhance label accuracy through in-depth probabilistic reasoning. We iteratively optimize the policy based on the feedback from a base EA model. Extensive experiments demonstrate the advantages of LLM4EA on four benchmark datasets in terms of effectiveness, robustness, and efficiency. Codes are available via https://github.com/chensyCN/llm4ea_official.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#464 - Chen 2008
Knowledge sharing in virtual enterprises via an ontology-based access control approach

Chen, T. Y.

Comput. Ind. 2008;59(5):502-519

2008

DOI: 10.1016/j.compind.2007.12.004 · Ref ID: 3755

Collaborating throughout a product life cycle via virtual enterprise (VE) is one of the most promising strategies for enhancing global competitiveness. Efficient and secure knowledge sharing is critical to the success of a VE. This study presents a novel approach, model and technology for knowledge access control and sharing across enterprises. First, this study proposes an ontology-based knowledge sharing model and a multiple-layer knowledge representation framework on which a knowledge access control model for knowledge sharing in a VE is proposed. In the proposed model, user authorizations permitting access to knowledge in a VE are classified into two levels: (1) basic privileges and (2) extended privileges. The former is evaluated from four dimensions, i.e. who, what, when and where, while the latter is determined by considering how three domain ontologies, i.e., product, organization and activity, are related. This study then develops a knowledge access control policy (KACP) language model which is used to identify the knowledge access control and sharing rules of a VE and all its enterprise members. The knowledge access control model proposed in this study can facilitate VE knowledge management and sharing across enterprises, enhance knowledge sharing security and flexibility and regulate knowledge sharing to expeditiously reflect changes in the business environment. (c) 2007 Elsevier B.V. All rights reserved.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1071 - Chen 2022
Chinese Machine Reading Comprehension Based on Language Model Containing Knowledge

Chen, W.; Fan, C.; Wu, Y.; Wang, Y.

ACM International Conference Proceeding Series 2022;():

Association for Computing Machinery 2022

DOI: 10.1145/3565387.3565405 · Ref ID: 5345

Machine reading comprehension (MRC) is a task that requires machines to answer relevant questions based on a given context. In recent years, it has attracted extensive attention with the development of deep learning and big data. Considering that human beings associate some external relevant knowledge when understanding a text, researchers have proposed a method of introducing knowledge outside the given context to assist reading, called Knowledge-Based Machine Reading Comprehension (KBMRC). However, current research on this method is still scattered, and the retrieval and fusion of relevant knowledge remain two challenges in application, especially in Chinese MRC. The contributions of this paper are mainly the following three points: Firstly, to resolve the problem of related knowledge retrieval, we build a related knowledge set. Secondly, to resolve the problem of related knowledge fusion, we propose a negative sample generation strategy and train a language model containing knowledge. Finally, a twin-tower fusion model is constructed based on this model. Experiments on the Chinese reading comprehension dataset CMRC2018 show that our method achieves a certain improvement over the baseline method without external knowledge. © 2022 Association for Computing Machinery.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1783 - Chen 2024
RareBench: Can LLMs Serve as Rare Diseases Specialists?

Chen, X.; Mao, X.; Guo, Q.; Wang, L.; Zhang, S.; Chen, T.

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():4850-4861

Association for Computing Machinery 2024

DOI: 10.1145/3637528.3671576 · Ref ID: 3993

Generalist Large Language Models (LLMs), such as GPT-4, have shown considerable promise in various domains, including medical diagnosis. Rare diseases, affecting approximately 300 million people worldwide, often have unsatisfactory clinical diagnosis rates primarily due to a lack of experienced physicians and the complexity of differentiating among many rare diseases. In this context, recent news such as "ChatGPT correctly diagnosed a 4-year-old's rare disease after 17 doctors failed" underscores LLMs' potential, yet underexplored, role in clinically diagnosing rare diseases. To bridge this research gap, we introduce RareBench, a pioneering benchmark designed to systematically evaluate the capabilities of LLMs on 4 critical dimensions within the realm of rare diseases. Meanwhile, we have compiled the largest open-source dataset on rare disease patients, establishing a benchmark for future studies in this domain. To facilitate differential diagnosis of rare diseases, we develop a dynamic few-shot prompt methodology, leveraging a comprehensive rare disease knowledge graph synthesized from multiple knowledge bases, significantly enhancing LLMs' diagnostic performance. Moreover, we present an exhaustive comparative study of GPT-4's diagnostic capabilities against those of specialist physicians. Our experimental findings underscore the promising potential of integrating LLMs into the clinical diagnostic process for rare diseases. This paves the way for exciting possibilities in future advancements in this field. © 2024 Copyright held by the owner/author(s).

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1823 - Chen 2024
Retrieval-Augmented Knowledge Integration into Language Models: A Survey

Chen, Y.; Röder, D.; Erker, J. J.; Hennig, L.; Thomas, P.; Möller, S.; Roller, R.

KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():45-63

Association for Computational Linguistics (ACL) 2024

Ref ID: 4251

This survey analyses how external knowledge can be integrated into language models in the context of retrieval-augmentation. The main goal of this work is to give an overview of: (1) Which external knowledge can be augmented? (2) Given a knowledge source, how to retrieve from it and then integrate the retrieved knowledge? To achieve this, we define and give a mathematical formulation of retrieval-augmented knowledge integration (RAKI). We discuss retrieval and integration techniques separately in detail, for each of the following knowledge formats: knowledge graph, tabular and natural language. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1937 - Chen 2025
Temporal Knowledge Graph Link Prediction Using Synergized Large Language Models and Temporal Knowledge Graphs

Chen, Y.; Shen, Y.

Communications in Computer and Information Science 2025;2183 CCIS():33-45

Springer Science and Business Media Deutschland GmbH 2025

DOI: 10.1007/978-981-97-7007-6_3 · Ref ID: 3875

Although large language models and temporal knowledge graphs each have significant advantages in the field of artificial intelligence, they also face certain challenges. However, through collaboration, large language models and temporal knowledge graphs can complement each other, addressing their respective shortcomings. This collaborative approach aims to harness the potential feasibility and practical effectiveness of large language models as external knowledge bases for temporal knowledge graph reasoning tasks. In our research, we have meticulously designed a synergized model that leverages the knowledge from the graph as prompts. The answers generated by the large language model undergo careful processing before being seamlessly incorporated into the training dataset. The ultimate goal is to significantly enhance the reasoning capabilities of temporal knowledge graphs. Experimental results underscore the positive impact of this synergized model on the completion tasks of temporal knowledge graphs, showcasing its potential to address gaps in knowledge and improve overall performance. While its influence on prediction tasks is relatively weak, the collaborative synergy demonstrates promising avenues for further exploration and development in the realm of AI research. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#2024 - Chen 2023
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation

Chen, Y.; Wang, X.; Li, M.; Hoiem, D.; Ji, H.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():13342-13357

Association for Computational Linguistics (ACL) 2023

Ref ID: 4940

State-of-the-art vision-language models (VLMs) still have limited performance in structural knowledge extraction, such as relations between objects. In this work, we present ViStruct, a training framework to learn VLMs for effective visual structural knowledge extraction. Two novel designs are incorporated. First, we propose to leverage the inherent structure of programming language to depict visual structural information. This approach enables explicit and consistent representation of visual structural information of multiple granularities, such as concepts, relations, and events, in a well-organized structured format. Second, we introduce curriculum-based learning for VLMs to progressively comprehend visual structures, from fundamental visual concepts to intricate event structures. Our intuition is that lower-level knowledge may contribute to complex visual structure understanding. Furthermore, we compile and release a collection of datasets tailored for visual structural knowledge extraction. We adopt a weakly-supervised approach to directly generate visual event structures from captions for ViStruct training, capitalizing on abundant image-caption pairs from the web. In experiments, we evaluate ViStruct on visual structure prediction tasks, demonstrating its effectiveness in improving the understanding of visual structures. The code is public at https://github.com/Yangyi-Chen/vi-struct. ©2023 Association for Computational Linguistics.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#2001 - Chen 2024
Uncertain Knowledge Graph Completion with Rule Mining

Chen, Y.; Wu, T.; Liu, Y.; Wang, Y.; Qi, G.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14883 LNCS():100-112

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-981-97-7707-5_9 · Ref ID: 4245

To model the uncertainty within knowledge graphs (KGs), existing studies define uncertain knowledge graphs (UKGs), which assign a confidence score to each triple to measure its likelihood of being true, enabling more precise downstream tasks such as reasoning and decision making. Since KGs usually suffer from the problem of incompleteness, methods of rule mining and reasoning for knowledge graph completion are extensively studied due to their excellent interpretability. However, previous methods are all conducted under deterministic scenarios, neglecting the uncertainty of knowledge, making them unable to be directly applied to UKGs. In this paper, we propose a new framework for uncertain knowledge graph completion with rule mining. The framework is composed of a rule mining model and a confidence prediction model. The rule mining model applies an encoder-decoder Transformer network to treat rule mining as a sequence-to-sequence task and generate rules. It models the uncertainty in UKGs and infers new triples by differentiable reasoning based on TensorLog with the mined rules. The confidence prediction model uses a pre-trained language model to predict the triple confidence given the mined rules. Experiments show that our models significantly outperform various baselines on different evaluation metrics for link prediction and confidence prediction, respectively. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#3541 - Chen 2024
Integrating Multi-Head Convolutional Encoders with Cross-Attention for Improved SPARQL Query Translation

Chen, Yi-Hui; Lu, Eric Jui-Lin; Cheng, Kwan-Ho

arXiv 2024;():

2024

Ref ID: 8558

The main task of the KGQA system (Knowledge Graph Question Answering) is to convert user input questions into query syntax (such as SPARQL). With the rise of modern popular encoders and decoders like Transformer and ConvS2S, many scholars have shifted the research direction of SPARQL generation to the Neural Machine Translation (NMT) architecture or the generative AI field of Text-to-SPARQL. In NMT-based QA systems, the system treats knowledge base query syntax as a language. It uses NMT-based translation models to translate natural language questions into query syntax. Scholars use popular architectures equipped with cross-attention, such as Transformer, ConvS2S, and BiLSTM, to train translation models for query syntax. To achieve better query results, this paper improved the ConvS2S encoder and added multi-head attention from the Transformer, proposing a Multi-Head Conv encoder (MHC encoder) based on the n-gram language model. The principle is to use convolutional layers to capture local hidden features in the input sequence with different receptive fields, using multi-head attention to calculate dependencies between them. Ultimately, we found that the translation model based on the Multi-Head Conv encoder achieved better performance than other encoders, obtaining 76.52% and 83.37% BLEU-1 (BiLingual Evaluation Understudy) on the QALD-9 and LC-QuAD-1.0 datasets, respectively. Additionally, in the end-to-end system experiments on the QALD-9 and LC-QuAD-1.0 datasets, we achieved leading results over other KGQA systems, with Macro F1-measures reaching 52% and 66%, respectively. Moreover, the experimental results show that with limited computational resources, if one possesses an excellent encoder-decoder architecture and cross-attention, experts and scholars can achieve outstanding performance equivalent to large pre-trained models using only general embeddings.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3649 - Chen 2024
Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law

Chen, Yongming; Chen, Miner; Zhu, Ye; Pei, Juan; Chen, Siyu; Zhou, Yu; Wang, Yi; Zhou, Yifan; Li, Hao; Zhang, Songan

arXiv 2024;():

2024

Ref ID: 8669

Court efficiency is vital for social stability. However, in most countries around the world, grassroots courts face case backlogs, with decisions relying heavily on judicial personnel's cognitive labor and lacking intelligent tools to improve efficiency. To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM). Firstly, we propose a Case-Enhanced Law Article Knowledge Graph (CLAKG) as a database to store current law statutes, historical case information, and correspondence between law articles and historical cases. Additionally, we introduce an automated CLAKG construction method based on LLM. On this basis, we propose a closed-loop law article recommendation method. Finally, through a series of experiments using judgment documents from the website "China Judgements Online", we have improved the accuracy of law article recommendation in cases from 0.549 to 0.694, demonstrating that our proposed method significantly outperforms baseline approaches.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#3601 - Chen 2024
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

Chen, Yuheng; Cao, Pengfei; Chen, Yubo; Liu, Kang; Zhao, Jun

arXiv 2024;():

2024

Ref ID: 8310

Large language models (LLMs) store extensive factual knowledge, but the mechanisms behind how they store and express this knowledge remain unclear. The Knowledge Neuron (KN) thesis is a prominent theory for explaining these mechanisms. This theory is based on the knowledge localization (KL) assumption, which suggests that a fact can be localized to a few knowledge storage units, namely knowledge neurons. However, this assumption may be overly strong regarding knowledge storage and neglects knowledge expression mechanisms. Thus, we re-examine the KL assumption and confirm the existence of facts that do not adhere to it from both statistical and knowledge modification perspectives. Furthermore, we propose the Query Localization (QL) assumption. (1) Query-KN Mapping: The localization results are associated with the query rather than the fact. (2) Dynamic KN Selection: The attention module contributes to the selection of KNs for answering a query. Based on this, we further propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification. We conduct 39 sets of experiments, along with additional visualization experiments, to rigorously validate our conclusions.

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#1911 - Chen 2024
Study on Entity Extraction Method for Pharmaceutical Instructions Based on Pretrained Models

Chen, Z.; Huang, Y.; Zhang, M.; Jiang, M.

J. Frontier. Comput. Sci. Technol. 2024;18(7):1911-1922

2024

DOI: 10.3778/j.issn.1673-9418.2304078 · Ref ID: 3925

The extraction of medical entities from drug instructions provides fundamental data for the intelligent retrieval of medication information and the construction of medical knowledge graphs, with remarkable research significance and practical value. However, the heterogeneity of medical entities in drug instructions for treating different diseases poses challenges in model training, which requires a large number of annotated samples. To address this issue, a "large model + small model" design approach is used in this research. Specifically, this research proposes a part-label named entity recognition model based on a pre-trained model, which first employs a pre-trained language model fine-tuned on a small number of samples to extract partial entities from drug instructions, and then utilizes a Transformer-based part-label model to further optimize the entity extraction results. The part-label model encodes the input text, identified partial entities, and entity labels using a planar lattice structure, extracts feature representations using Transformer, and predicts entity labels through a conditional random fields (CRF) layer. To reduce the need for annotated training data, a sample data augmentation method is proposed using an entity masking strategy on labeled samples to train the part-label model. Experimental results validate the feasibility of the "large model + small model" approach in medical entity extraction, with precision (P), recall (R), and F1 score of 85.0%, 86.1%, and 85.6%, respectively, demonstrating superior performance compared with other learning methods. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#281 - Chen 2023
The First Workshop on Personalized Generative AI @ CIKM 2023: Personalization Meets Large Language Models

Chen, Z.; Jiang, Z. Y.; Yang, F.; He, Z. U.; Hou, Y. P.; Cho, E.; McAuley, J.; Galstyan, A.; Hu, X. H.; Acm

32nd ACM International Conference on Information and Knowledge Management (CIKM) 2023;():5267-5270

Birmingham, ENGLAND Assoc Computing Machinery 2023

DOI: 10.1145/3583780.3615314 · Ref ID: 3552

The First Workshop on Personalized Generative AI(1) aims to be a cornerstone event fostering innovation and collaboration in the dynamic field of personalized AI. Leveraging the potent capabilities of Large Language Models (LLMs) to enhance user experiences with tailored responses and recommendations, the workshop is designed to address a range of pressing challenges including knowledge gap bridging, hallucination mitigation, and efficiency optimization in handling extensive user profiles. As a nexus for academics and industry professionals, the event promises rich discussions on a plethora of topics such as the development and fine-tuning of foundational models, strategies for multi-modal personalization, and the imperative ethical and privacy considerations in LLM deployment. Through a curated series of keynote speeches, insightful panel discussions, and hands-on sessions, the workshop aspires to be a catalyst in the development of more precise, contextually relevant, and user-centric AI systems. It aims to foster a landscape where generative AI systems are not only responsive but also anticipatory of individual user needs, marking a significant stride in personalized experiences.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1934 - Chen 2023
Tele-Knowledge Pre-training for Fault Analysis

Chen, Z.; Zhang, W.; Huang, Y.; Chen, M.; Geng, Y.; Yu, H.; Bi, Z.; Zhang, Y.; Yao, Z.; Song, W.; Wu, X.; Yang, Y.; Lian, Z.; Li, Y.; Cheng, L.; Chen, H.

Proceedings - International Conference on Data Engineering 2023;2023-April():3453-3466

IEEE Computer Society 2023

DOI: 10.1109/ICDE55515.2023.00265 · Ref ID: 5233

In this work, we share our experience on tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents. To organize this knowledge from experts uniformly, we propose to create a Tele-KG (tele-knowledge graph). Using this valuable data, we further propose a tele-domain language pre-training model TeleBERT and its knowledge-enhanced version, a tele-knowledge re-training model KTeleBERT, which includes effective prompt hints, adaptive numerical data encoding, and two knowledge injection paradigms. Concretely, our proposal includes two stages: first, pre-training TeleBERT on 20 million tele-related corpora, and then re-training it on 1 million causal and machine-related corpora to obtain KTeleBERT. Our evaluation on multiple tasks related to fault analysis in tele-applications, including root-cause analysis, event association prediction, and fault chain tracing, shows that pre-training a language model with tele-domain data is beneficial for downstream tasks. Moreover, the KTeleBERT re-training further improves the performance of task models, highlighting the effectiveness of incorporating diverse tele-knowledge into the model. © 2023 IEEE.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3771 - Chen 2024
The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework

Chen, Zhuo; Fang, Yin; Zhang, Yichi; Guo, Lingbing; Chen, Jiaoyan; Chen, Huajun; Zhang, Wen

arXiv 2024;():

2024

Ref ID: 8173

The advancement of Multi-modal Pre-training highlights the necessity for a robust Multi-Modal Knowledge Graph (MMKG) representation learning framework. This framework is crucial for integrating structured knowledge into multi-modal Large Language Models (LLMs) at scale, aiming to alleviate issues like knowledge misconceptions and multi-modal hallucinations. In this work, to evaluate models' ability to accurately embed entities within MMKGs, we focus on two widely researched tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking for the robust integration of multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets (three for MKGC and seven for MMEA), demonstrating its robustness and versatility. Besides, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Our code and data are available at: https://github.com/zjukg/SNAG.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#197 - Cheng 2024
Editing Language Model-Based Knowledge Graph Embeddings

Cheng, S. Y.; Zhang, N. Y.; Tian, B. Z.; Chen, X.; Liu, Q. B.; Chen, H. J.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():17835-17843

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 2942

Recent decades have witnessed the empirical success of framing Knowledge Graph (KG) embeddings via language models. However, language model-based KG embeddings are usually deployed as static artifacts, making them difficult to modify after deployment without re-training. To address this issue, we propose a new task of editing language model-based KG embeddings in this paper. This task is designed to facilitate rapid, data-efficient updates to KG embeddings without compromising the performance of other aspects. We build four new datasets: E-FB15k237, A-FB15k237, E-WN18RR, and A-WN18RR, and evaluate several knowledge editing baselines, demonstrating the limited ability of previous models to handle the proposed challenging task. We further propose a simple yet strong baseline dubbed KGEditor, which utilizes additional parametric layers of the hypernetwork to edit/add facts. Our comprehensive experimental results reveal that KGEditor excels in updating specific facts without impacting the overall performance, even when faced with limited training resources. Code and datasets will be available at https://github.com/AnonymousForPapers/DeltaKG.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1047 - Cheng 2024
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments

Cheng, S.; Zhuang, Z.; Xu, Y.; Yang, F.; Zhang, C.; Qin, X.; Huang, X.; Chen, L.; Lin, Q.; Zhang, D.; Rajmohan, S.; Zhang, Q.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():4275-4295

Association for Computational Linguistics (ACL) 2024

Ref ID: 4297

Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graphs and tables. Such tasks typically require multi-hop reasoning, i.e., matching natural language utterances with instances in the environment. Previous works adopt LLMs to incrementally build a reasoning path, where LLMs either invoke tools or pick items by interacting with the environment step by step. We propose Reasoning-Path-Editing (Readi), a novel framework where LLMs can efficiently and faithfully reason over structured environments. In Readi, LLMs initially generate a reasoning path given a query, and edit the path only when necessary. We instantiate the path on structured environments and provide feedback to edit the path if anything goes wrong. Experimental results on three KGQA and two TableQA datasets show the effectiveness of Readi, significantly surpassing previous LLM-based methods (by 9.1% Hit@1 on WebQSP, 12.4% on MQA-3H and 9.5% on WTQ), comparable with state-of-the-art fine-tuned methods (67% on CWQ and 74.7% on WebQSP) and substantially boosting the vanilla LLMs (by 14.9% on CWQ). Our code will be available on https://aka.ms/readi. © 2024 Association for Computational Linguistics.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#1771 - Chepurova 2024
Prompt Me One More Time: A Two-Step Knowledge Extraction Pipeline with Ontology-Based Verification

Chepurova, A.; Kuratov, Y.; Bulatov, A.; Burtsev, M.

TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():61-77

Association for Computational Linguistics (ACL) 2024

Ref ID: 4410

This study explores a method for extending real-world knowledge graphs (specifically, Wikidata) by extracting triplets from texts with the aid of Large Language Models (LLMs). We propose a two-step pipeline that includes the initial extraction of entity candidates, followed by their refinement and linkage to the canonical entities and relations of the knowledge graph. Finally, we utilize Wikidata relation constraints to select only verified triplets. We compare our approach to a model that was fine-tuned on a machine-generated dataset and demonstrate that it performs better on natural data. Our results suggest that LLM-based triplet extraction from texts, with subsequent verification, is a viable method for real-world applications. © 2024 Association for Computational Linguistics.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#2303 - Cherniahovskaya 2006
Decision support in strategic control on the base of knowledge management

Cherniahovskaya, L.; Nougayeva, K.

2006 IEEE International Technology Management Conference (ICE) 2006;():1-4

2006

DOI: 10.1109/ICE.2006.7477068 · Ref ID: 6218

The paper presents a solution to the problem of increasing the quality of strategic control decisions on the basis of knowledge management. A hypertext knowledge base for collaborative knowledge gathering, storage, management and presentation is developed. An objective-cognitive analysis methodology is presented for the design of the hypertext knowledge base. This methodology integrates methods of objective analysis and design with the Unified Modeling Language, semantic analysis and ontology analysis of the domain. An algorithm of case-based reasoning for decision support is presented. A sample application of the intelligent decision support system in the education process is also shown.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1598 - Chernomorchenko 2024
Leveraging Taxonomic Information from Large Language Models for Hyponymy Prediction

Chernomorchenko, P.; Panchenko, A.; Nikishina, I.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14486 LNCS():49-63

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-3-031-54534-4_4 · Ref ID: 4614

Pre-trained language models contain a vast amount of linguistic information as well as knowledge about the structure of the world. Both of these attributes are extremely beneficial for automatic enrichment of semantic graphs, such as knowledge bases and lexical-semantic databases. In this article, we employ generative language models to predict descendants of existing nodes in lexical data structures based on IS-A relations, such as WordNet. To accomplish this, we conduct experiments utilizing diverse formats of artificial text input containing information from lexical taxonomy for the English and Russian languages. Our findings demonstrate that the incorporation of data from the knowledge graph into a text input significantly affects the quality of hyponym prediction. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#551 - Chi 2024
Maximizing the Social Welfare of Decentralized Knowledge Inference through Evolutionary Game

Chi, Y. F.; Zhang, Q. Y.; Sun, J. X.; Cai, W.; Wang, Z. J.; Leung, V. C. M.; Ieee

IEEE Conference on Computer Communications (IEEE INFOCOM) 2024;():

Vancouver, CANADA Ieee 2024

DOI: 10.1109/infocomwkshps61880.2024.10620663 · Ref ID: 3481

To broaden their domain knowledge coverage, large language models (LLMs) increasingly incorporate extensive corpus data from various industries. These heterogeneous datasets are often maintained by different stakeholders, where issues of data heterogeneity, privacy, and the network cost of data transmission have attracted much attention. To address these challenges, researchers have studied the integration of LLMs with knowledge graphs to manage data heterogeneity and with edge computing to ensure data privacy and transmission efficiency. In this work, we introduce a reputation system and a spot-check mechanism for a decentralized knowledge inference system in which edge nodes can collaborate with others for knowledge sharing while preserving their data privacy. We then use an evolutionary game model to study the dynamic decision-making between requestors and workers. Moreover, we show that higher reward values and higher model quality accelerate the maximization of social welfare.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#426 - Chis 2024
A Knowledge Graph Approach to Cyber Threat Mitigation Derived from Data Flow Diagrams

Chis, A.; Stoica, O. I.; Ghiran, A. M.; Buchmann, R. A.; Ieee

International Conference on Automation, Quality and Testing, Robotics (AQTR) 2024;():71-76

Cluj-Napoca, ROMANIA Ieee 2024

DOI: 10.1109/aqtr61889.2024.10554074 · Ref ID: 3002

Data Flow Diagrams (DFD) have proven effective in designing and analyzing the flow of data in enterprise systems. They serve as indispensable tools for enterprises undergoing a transition to cloud services, aiding in understanding current processes and identifying the interfaces and integration points that require security measures. This paper reports on a Design Science project to mitigate cyber security threats at the design phase of a system and to audit an existing system through knowledge graphs. The proposal leverages knowledge gathered from various sources in a knowledge graph to identify semantic relationships and patterns, enabling automated inference, analysis, and detection of vulnerability patterns. Furthermore, large language model (LLM)-based capabilities transform data management details captured as Data Flow Diagrams into knowledge graphs for semantic querying and improved decision support.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2520 - Chishti 2014
A grounding of business process modeling based on temporal logic

Chishti, I.

International Conference on Information Society (i-Society 2014) 2014;():266-273

2014

DOI: 10.1109/i-Society.2014.7009058 · Ref ID: 6130

This paper proposes a grounding for business process modeling (BPM) based on a general time theory that provides an axiomatic system. First-order logic is used to give a clear definition of an abstract business process and the corresponding temporal relations, including relations derived from a single "Meets" relation. The temporal logic used here treats time intervals and time points on an equal footing. We use a model-theoretic approach, in which an abstract business process is represented as a formal system and mapped to an instance, i.e., a concrete realization. We also use resolution theorem proving to establish its soundness and completeness properties. A process temporal graph, a directed graph with a defined graphical notation, is introduced to represent the temporal knowledge; a real-world realization of the corresponding graph is considered an instance of an abstract business process. Soundness and completeness properties of the process temporal graph are established using reachability analysis. In the graph, arcs represent time elements, vertices represent the `Meets' relation, and the notation also allows the expression of both logical AND and OR.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3131 - Cho 2024
FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning

Cho, Nicole; Srishankar, Nishan; Cecchi, Lucas; Watson, William

Proceedings of the 5th ACM International Conference on AI in Finance 2024;():591–599

Brooklyn, NY, USA Association for Computing Machinery 2024

DOI: 10.1145/3677052.3698597 · Ref ID: 7291

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#317 - Cho 2023
Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies

Cho, Y. E.

Linguist. Res. 2023;40(2):317-352

2023

DOI: 10.17250/khisli.40.2.202306.007 · Ref ID: 3559

The phenomenon of attraction effects, whereby a verb erroneously retrieves a syntactically inaccessible but feature-matching noun, is a type of grammatical illusion (Phillips, Wagers, and Lau 2011) that can occur in long-distance subject-verb agreement in human sentence processing (Wagers et al. 2009). In contrast, reflexive-antecedent dependencies have been claimed to lack attraction effects when the reflexive and the antecedent mismatch (Dillon et al. 2013). Yet other studies have shown that attraction effects are observed in reflexive-antecedent dependencies when the number of feature mismatches between the reflexive and the antecedent increases (Parker and Phillips 2017). These findings suggest that there are different cue weightings based on the predictability of the dependency, and that these cues are combined according to different cue-combination schemes, such as a linear or a non-linear cue-combination rule (Parker 2019). These linguistic phenomena can be used to analyze how linguistic features are accessed and combined within the internal states of deep neural network (DNN) language models. In the linguistic representations of BERT (Devlin et al. 2018), one of the pre-trained DNN language models, various types of linguistic information are encoded in each layer (Jawahar et al. 2019) and combined while passing through the layers. By measuring the performance of the masked language model (MLM), this study finds that both subject-verb agreement and reflexive-antecedent dependencies show attraction effects and follow the linear-combinatoric rule in BERT.
The different results from human sentence processing suggest that the self-attention mechanism of BERT may not capture differences in the predictability of a dependency as effectively as memory retrieval mechanisms in humans. These findings have important implications for developing more understandable and interpretable explainable-AI (xAI) systems that better capture the complexities of human language processing. (Sungkyunkwan University)

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#561 - Choi 2021
MEM-KGC: Masked Entity Model for Knowledge Graph Completion With Pre-Trained Language Model

Choi, B.; Jang, D.; Ko, Y.

IEEE Access 2021;9():132025-132032

2021

DOI: 10.1109/access.2021.3113329 · Ref ID: 3008

The knowledge graph completion (KGC) task aims to predict missing links in knowledge graphs. Recently, several KGC models based on translational distance or semantic matching methods have been proposed and have achieved meaningful results. However, existing models have a significant shortcoming-they cannot train entity embedding when an entity does not appear in the training phase. As a result, such models use randomly initialized embeddings for entities that are unseen in the training phase and cause a critical decrease in performance during the test phase. To solve this problem, we propose a new approach that performs KGC task by utilizing the masked language model (MLM) that is used for a pre-trained language model. Given a triple (head entity, relation, tail entity), we mask the tail entity and consider the head entity and the relation as a context for the tail entity. The model then predicts the masked entity from among all entities. Then, the task is conducted by the same process as an MLM, which predicts a masked token with a given context of tokens. Our experimental results show that the proposed model achieves significantly improved performances when unseen entities appear during the test phase and achieves state-of-the-art performance on the WN18RR dataset.
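The core reformulation in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the token layout and the toy scoring function are hypothetical stand-ins for BERT's input format and MLM head.

```python
# Cast a KGC triple as a masked-language-model example: the tail entity is
# replaced by a [MASK] token, and the head entity plus relation form its context.
MASK = "[MASK]"
SEP = "[SEP]"

def to_mlm_example(head, relation, tail):
    """Return (masked input text, target entity) for tail-entity prediction."""
    text = f"{head} {SEP} {relation} {SEP} {MASK}"
    return text, tail

def predict_tail(head, relation, candidate_entities, score_fn):
    """Rank every candidate entity for the masked position.

    score_fn(text, entity) stands in for the MLM head's probability of the
    entity filling the [MASK] in the given context.
    """
    text, _ = to_mlm_example(head, relation, MASK)
    return max(candidate_entities, key=lambda e: score_fn(text, e))
```

Because the context is plain text rather than a learned entity embedding, the same scoring path works for entities never seen during training, which is the property the paper exploits.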

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#441 - Choi 2023
Knowledge graph extension with a pre-trained language model via unified learning method

Choi, B.; Ko, Y. J.

Knowledge-Based Syst. 2023;262():8

2023

DOI: 10.1016/j.knosys.2022.110245 · Ref ID: 3049

Knowledge graphs (KGs) are collections of real-world knowledge represented in the structured form of triples. Since they are built manually in their nascent stage, a common problem is that some links (triples) are missing. Knowledge graph completion (KGC) aims to find those missing links and thereby complete the KGs. However, as knowledge grows through diverse sources, new entities emerge explosively and need to be connected to existing KGs. Open-world KGC therefore targets extending KGs to those new entities. Dealing with new entities is challenging because they have no connection to entities in the existing KGs. One way to handle them is to embed their textual descriptions with pre-trained word embeddings and score them in the graph-vector space with typical existing KGC models. These models have produced meaningful results, but there is still a lack of studies utilizing the latest neural networks, such as pre-trained language models, which are known to capture context better than pre-trained word embeddings. This paper proposes a novel model that effectively connects new entities to existing KGs through a pre-trained language model. To handle the problem effectively, we utilize two learning methods: one is the classification method of the masked language model (MLM), which predicts a word from a huge vocabulary given a context, and the other is multi-task learning based on Multi-Task Deep Neural Networks (MT-DNN). Based on these methods, the model first generates an embedding of a new entity using its textual description and then uses the embedding to find an existing entity in the KG to which the new entity can be connected.
The experimental results on three benchmark datasets, DBPedia50k, FB15k-237-OWE, and FB20k, show that the proposed model improves performance by 9.2%p, 4.4%p, and 11.1%p, respectively, achieving new state-of-the-art performance on all datasets. (c) 2022 Elsevier B.V. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#14 - Choi 2023
ALBERT with Knowledge Graph Encoder Utilizing Semantic Similarity for Commonsense Question Answering

Choi, B.; Lee, Y.; Kyung, Y.; Kim, E.

Intell. Autom. Soft Comput. 2023;36(1):71-82

2023

DOI: 10.32604/iasc.2023.032783 · Ref ID: 2964

Recently, pre-trained language representation models such as bidirectional encoder representations from transformers (BERT) have been performing well in commonsense question answering (CSQA). However, these models do not directly use explicit information from knowledge sources existing outside the model. To augment this, additional methods such as the knowledge-aware graph network (KagNet) and the multi-hop graph relation network (MHGRN) have been proposed. In this study, we propose to use the latest pre-trained language model, A Lite BERT (ALBERT), with a knowledge graph information extraction technique. We also propose applying a novel method, schema graph expansion, to recent language models. We then analyze the effect of applying knowledge graph-based knowledge extraction techniques to recent pre-trained language models and confirm that schema graph expansion is effective to some extent. Furthermore, we show that our proposed model can achieve better performance than the existing KagNet and MHGRN models on the CommonsenseQA dataset.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3800 - Choi 2024
Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games

Choi, Juhwan; Kim, YoungBin

arXiv 2024;():

2024

Ref ID: 8588

Large language models (LLMs) have become a dominant approach in natural language processing, yet their internal knowledge structures remain largely unexplored. In this paper, we analyze the internal knowledge structures of LLMs using historical medal tallies from the Olympic Games. We task the models with providing the medal counts for each team and identifying which teams achieved specific rankings. Our results reveal that while state-of-the-art LLMs perform remarkably well in reporting medal counts for individual teams, they struggle significantly with questions about specific rankings. This suggests that the internal knowledge structures of LLMs are fundamentally different from those of humans, who can easily infer rankings from known medal counts. To support further research, we publicly release our code, dataset, and model outputs.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1237 - Choi 2024
Embodied CoT Distillation From LLM To Off-the-shelf Agents

Choi, W.; Kim, W. K.; Yoo, M.; Woo, H.

Proceedings of Machine Learning Research 2024;235():8702-8721

ML Research Press 2024

Ref ID: 4359

We address the challenge of utilizing large language models (LLMs) for complex embodied tasks, in the environment where decision-making systems operate timely on capacity-limited, off-the-shelf devices. We present DEDER, a framework for decomposing and distilling the embodied reasoning capabilities from LLMs to efficient, small language model (sLM)-based policies. In DEDER, the decision-making process of LLM-based strategies is restructured into a hierarchy with a reasoning-policy and planning-policy. The reasoning-policy is distilled from the data that is generated through the embodied in-context learning and self-verification of an LLM, so it can produce effective rationales. The planning-policy, guided by the rationales, can render optimized plans efficiently. In turn, DEDER allows for adopting sLMs for both policies, deployed on off-the-shelf devices. Furthermore, to enhance the quality of intermediate rationales, specific to embodied tasks, we devise the embodied knowledge graph, and to generate multiple rationales timely through a single inference, we also use the contrastively prompted attention model. Our experiments with the ALFRED benchmark demonstrate that DEDER surpasses leading language planning and distillation approaches, indicating the applicability and efficiency of sLM-based embodied policies derived through DEDER. Copyright 2024 by the author(s)

Ishan voted
Kwesi voted
Final decision
What was the agreed final decision?

#3338 - Choubey 2024
Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency

Choubey, Prafulla Kumar; Su, Xin; Luo, Man; Peng, Xiangyu; Xiong, Caiming; Le, Tiep; Rosenman, Shachar; Lal, Vasudev; Mui, Phil; Ho, Ricky; Howard, Phillip; Wu, Chien-Sheng

arXiv 2024;():

2024

Ref ID: 8740

Knowledge graphs (KGs) generated by large language models (LLMs) are becoming increasingly valuable for Retrieval-Augmented Generation (RAG) applications that require knowledge-intensive reasoning. However, existing KG extraction methods predominantly rely on prompt-based approaches, which are inefficient for processing large-scale corpora. These approaches often suffer from information loss, particularly with long documents, due to the lack of specialized design for KG construction. Additionally, there is a gap in evaluation datasets and methodologies for ontology-free KG construction. To overcome these limitations, we propose SynthKG, a multi-step, document-level ontology-free KG synthesis workflow based on LLMs. By fine-tuning a smaller LLM on the synthesized document-KG pairs, we streamline the multi-step process into a single-step KG generation approach called Distill-SynthKG, substantially reducing the number of LLM inference calls. Furthermore, we re-purpose existing question-answering datasets to establish KG evaluation datasets and introduce new evaluation metrics. Using KGs produced by Distill-SynthKG, we also design a novel graph-based retrieval framework for RAG. Experimental results demonstrate that Distill-SynthKG not only surpasses all baseline models in KG quality – including models up to eight times larger – but also consistently excels in retrieval and question-answering tasks. Our proposed graph retrieval framework also outperforms all KG-retrieval methods across multiple benchmark datasets. We release the SynthKG dataset and Distill-SynthKG model publicly to support further research and development.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3422 - Chuang 2024
FaithLM: Towards Faithful Explanations for Large Language Models

Chuang, Yu-Neng; Wang, Guanchu; Chang, Chia-Yuan; Tang, Ruixiang; Zhong, Shaochen; Yang, Fan; Du, Mengnan; Cai, Xuanting; Hu, Xia

arXiv 2024;():

2024

Ref ID: 8080

Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their extensive internal knowledge and reasoning capabilities. However, the black-box nature of these models complicates the task of explaining their decision-making processes. While recent advancements demonstrate the potential of leveraging LLMs to self-explain their predictions through natural language (NL) explanations, their explanations may not accurately reflect the LLMs' decision-making process due to a lack of fidelity optimization on the derived explanations. Measuring the fidelity of NL explanations is a challenging issue, as it is difficult to manipulate the input context to mask the semantics of these explanations. To this end, we introduce FaithLM to explain the decision of LLMs with NL explanations. Specifically, FaithLM designs a method for evaluating the fidelity of NL explanations by incorporating the contrary explanations to the query process. Moreover, FaithLM conducts an iterative process to improve the fidelity of derived explanations. Experiment results on three datasets from multiple domains demonstrate that FaithLM can significantly improve the fidelity of derived explanations, which also provides a better alignment with the ground-truth explanations.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1342 - Colas 2022
GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation

Colas, A.; Alvandipour, M.; Wang, D. Z.

Proceedings - International Conference on Computational Linguistics, COLING 2022;29():5755-5769

Association for Computational Linguistics (ACL) 2022

Ref ID: 5306

Recent improvements in KG-to-text generation are due to additional auxiliary pre-training tasks designed to give the fine-tune task a boost in performance. These tasks require extensive computational resources while only suggesting marginal improvements. Here, we demonstrate that by fusing graph-aware elements into existing pre-trained language models, we are able to outperform state-of-the-art models and close the gap imposed by additional pre-training tasks. We do so by proposing a mask structure to capture neighborhood information and a novel type encoder that adds a bias to the graph-attention weights depending on the connection type. Experiments on two KG-to-text benchmark datasets show our models are competitive while involving fewer parameters and no additional pre-training tasks. By formulating the problem as a framework, we can interchange the various proposed components and begin interpreting KG-to-text generative models based on the topological and type information found in a graph. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3651 - Colombo 2024
Leveraging Knowledge Graphs and LLMs to Support and Monitor Legislative Systems

Colombo, Andrea

arXiv 2024;():

2024

Ref ID: 8612

Knowledge Graphs (KGs) have been used to organize large datasets into structured, interconnected information, enhancing data analytics across various fields. In the legislative context, one potential natural application of KGs is modeling the intricate set of interconnections that link laws and their articles with each other and the broader legislative context. At the same time, the rise of large language models (LLMs) such as GPT has opened new opportunities in legal applications, such as text generation and document drafting. Despite their potential, the use of LLMs in legislative contexts is critical since it requires the absence of hallucinations and reliance on up-to-date information, as new laws are published on a daily basis. This work investigates how Legislative Knowledge Graphs and LLMs can synergize and support legislative processes. We address three key questions: the benefits of using KGs for legislative systems, how LLM can support legislative activities by ensuring an accurate output, and how we can allow non-technical users to use such technologies in their activities. To this aim, we develop Legis AI Platform, an interactive platform focused on Italian legislation that enhances the possibility of conducting legislative analysis and that aims to support lawmaking activities.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3270 - Colon-Hernandez 2021
Combining pre-trained language models and structured knowledge

Colon-Hernandez, Pedro; Havasi, Catherine; Alonso, Jason; Huggins, Matthew; Breazeal, Cynthia

arXiv 2021;():

2021

Ref ID: 7436

In recent years, transformer-based language models have achieved state-of-the-art performance in various NLP benchmarks. These models are able to extract mostly distributional information, with some semantics, from unstructured text; however, it has proven challenging to integrate structured information, such as knowledge graphs, into these models. We examine a variety of approaches to integrating structured knowledge into current language models and identify the challenges, and possible opportunities, in leveraging both structured and unstructured information sources. From our survey, we find that there are still opportunities in exploiting adapter-based injections and that it may be possible to further combine several of the explored approaches into one system.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3179 - Colon-Hernandez 2023
Adversarial Transformer Language Models for Contextual Commonsense Inference

Colon-Hernandez, Pedro; Lieberman, Henry; Xin, Yida; Yin, Claire; Breazeal, Cynthia; Chin, Peter

arXiv 2023;():

2023

Ref ID: 7647

Contextualized or discourse aware commonsense inference is the task of generating coherent commonsense assertions (i.e., facts) from a given story, and a particular sentence from that story. Some problems with the task are: lack of controllability for topics of the inferred facts; lack of commonsense knowledge during training; and, possibly, hallucinated or false facts. In this work, we utilize a transformer model for this task and develop techniques to address the aforementioned problems in the task. We control the inference by introducing a new technique we call "hinting". Hinting is a kind of language model prompting, that utilizes both hard prompts (specific words) and soft prompts (virtual learnable templates). This serves as a control signal to advise the language model "what to talk about". Next, we establish a methodology for performing joint inference with multiple commonsense knowledge bases. Joint inference of commonsense requires care, because it is imprecise and the level of generality is more flexible. You want to be sure that the results "still make sense" for the context. To this end, we align the textual version of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and GLUCOSE) with a story and a target sentence. This combination allows us to train a single model to perform joint inference with multiple knowledge graphs. We show experimental results for the three knowledge graphs on joint inference. Our final contribution is exploring a GAN architecture that generates the contextualized commonsense assertions and scores them as to their plausibility through a discriminator. The result is an integrated system for contextual commonsense inference in stories, that can controllably generate plausible commonsense assertions, and takes advantage of joint inference between multiple commonsense knowledge bases.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#2026 - Corrado 2023
VO.I.C.E. FIRST: Supporting Human Assistants with Real-Time Voice Understanding

Corrado, M.; Giliberti, V.; Gozzi, M.; Lanzolla, V.; Vetere, G.; Zurlo, D.

2023 IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2023 - Proceedings 2023;():1104-1109

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/MetroXRAINE58569.2023.10405568 · Ref ID: 4965

While AI and automation have made significant strides in customer support, there are still situations where human intervention via voice channels is necessary to provide the best possible customer experience. In fact, although AI and chatbots have become increasingly sophisticated, they may not always be able to handle complex or nuanced customer issues. Human agents can better understand and respond to these situations, providing tailored solutions. At the same time, solving non-trivial customer problems often requires access to knowledge bases and contextual customer information, for which AI is particularly well suited. Hence the idea of integrating human and artificial intelligence in a hybrid solution. We developed an AI system to help human assistants in the process of handling conversations. This system can be viewed as a collaborative bot (cobot). The cobot captures the audio stream of the conversation, converts it to text and analyzes it in real time. The extracted tokens are classified and sent to a reasoning system based on a knowledge graph, that provides information and action suggestions to the human assistant. Assistants are also capable of providing information to the reasoning system, utilizing their human understanding of the client's circumstances as they unfold. While designing a prototypical solution for utility services, we have faced the problem of real-time use of computationally complex procedures, including spontaneous speech understanding and knowledge-based heuristic rules. Moreover, we adopted a standards-based approach and experimented with open source reasoners and publicly available language models. The paper outlines the system architecture and design, and discusses the results of the first experiments. © 2023 IEEE.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#37 - Corrado 2023
Assisting the Assistant: A Cobot for Voice Customer Support

Corrado, M.; Giliberti, V.; Gozzi, M.; Lanzolla, V.; Vetere, G.; Zurlo, D.

2nd International Conference on Hybrid Human-Artificial Intelligence (HHAI) 2023;368():330-339

Munich, GERMANY Ios Press 2023

DOI: 10.3233/faia230096 · Ref ID: 3445

Despite recent advances in automation, customer support still requires a substantial amount of human intervention through voice channels. With the aim of improving the work of human assistants, we developed a collaborative bot (cobot) to help them in the process of handling customer voice interactions. The cobot is a reasoning agent that starts from loading background customer data into a dynamic knowledge graph. Then it captures the audio stream of the conversation, converts it to text in real time, analyzes the blocks of conversation with neural technologies and "thinks" about the results. Assistants can also supply data to the cobot, based on the information they gather from the ongoing conversation. The reasoning agent provides information and action suggestions to the human assistant by applying heuristics on data collected from both automatic and human sources, based on a task and domain-specific conceptual models (ontologies). While designing a prototypical solution for utility services in Italy, we are faced with many problems, including spontaneous speech understanding, factual and linguistic knowledge representation, and efficient heuristic reasoning. We adopted a standards-based approach and experimented with open source reasoners and publicly available language models. The paper presents preliminary findings and outlines the system design, with focus on the interplay of neural language processing and logic reasoning.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3355 - D'Abramo 2024
Dynamic Few-Shot Learning for Knowledge Graph Question Answering

D'Abramo, Jacopo; Zugarini, Andrea; Torroni, Paolo

arXiv 2024;():

2024

Ref ID: 8439

Large language models present opportunities for innovative Question Answering over Knowledge Graphs (KGQA). However, they are not inherently designed for query generation. To bridge this gap, solutions have been proposed that rely on fine-tuning or ad-hoc architectures, achieving good results but limited out-of-domain distribution generalization. In this study, we introduce a novel approach called Dynamic Few-Shot Learning (DFSL). DFSL integrates the efficiency of in-context learning and semantic similarity and provides a generally applicable solution for KGQA with state-of-the-art performance. We run an extensive evaluation across multiple benchmark datasets and architecture configurations.
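The idea of combining in-context learning with semantic similarity can be sketched as follows. This is a hedged illustration, not the paper's DFSL implementation: the bag-of-words cosine is a stand-in for a learned sentence embedding, and the example pool and prompt layout are invented for the sketch.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_demonstrations(question, pool, k=2):
    """Return the k (question, query) pairs most similar to `question`."""
    qv = Counter(question.lower().split())
    ranked = sorted(pool,
                    key=lambda ex: cosine(qv, Counter(ex[0].lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, pool, k=2):
    """Assemble a few-shot prompt from the dynamically selected examples."""
    demos = select_demonstrations(question, pool, k)
    shots = "\n".join(f"Q: {q}\nSPARQL: {s}" for q, s in demos)
    return f"{shots}\nQ: {question}\nSPARQL:"
```

The demonstrations change with every incoming question, which is what makes the few-shot set "dynamic" rather than a fixed hand-picked prompt.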

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1174 - D’Aragona 2024
Design of a Knowledge Hub of Heterogeneous Multisource Documents to support Public Authorities

D’Aragona, P. T. A.; Babbini, L.; Bordogna, G.; Lotti, A.; Minelli, A.; Oggioni, A.

CEUR Workshop Proceedings 2024;3762():430-435

CEUR-WS 2024

Ref ID: 4200

This contribution outlines the design of a Knowledge Hub of heterogeneous documents related to the Mediterranean Action Plan UNEP-MAP of the United Nations Environment Program [1]. The Knowledge Hub is intended to serve as a resource that assists public authorities and users with different backgrounds and needs in accessing information efficiently. Users can either formulate natural language queries or navigate an automatically generated knowledge graph to find relevant documents. The Knowledge Hub is designed on the basis of state-of-the-art Large Language Models (LLMs). A user-evaluation experiment was conducted, testing publicly available models on a subset of documents under distinct LLM settings. This step aimed to identify the best-performing model for subsequent use in classifying the documents with respect to the topics of interest. © 2024 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1285 - D’Souza 2023
Evaluating Prompt-Based Question Answering for Object Prediction in the Open Research Knowledge Graph

D’Souza, J.; Hrou, M.; Auer, S.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14146 LNCS():508-515

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-39847-6_40 · Ref ID: 5201

Recent investigations have explored prompt-based training of transformer language models for new text genres in low-resource settings. This approach has proven effective in transferring pre-trained or fine-tuned models to resource-scarce environments. This work presents the first results on applying prompt-based training to transformers for scholarly knowledge graph object prediction. Methodologically, it stands out in two main ways: 1) it deviates from previous studies that propose entity and relation extraction pipelines, and 2) it tests the method in a significantly different domain, scholarly knowledge, evaluating linguistic, probabilistic, and factual generalizability of large-scale transformer models. Our findings demonstrate that: i) out-of-the-box transformer models underperform on the new scholarly domain, ii) prompt-based training improves performance by up to 40% in relaxed evaluation, and iii) tests of the models in a distinct domain reveal a gap in capturing domain knowledge, highlighting the need for increased attention and resources in the scholarly domain for transformer models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3182 - Da 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models

Da, Jeff; Bras, Ronan Le; Lu, Ximing; Choi, Yejin; Bosselut, Antoine

arXiv 2021;():

2021

Ref ID: 7434

Recently, commonsense knowledge models - pretrained language models (LM) fine-tuned on knowledge graph (KG) tuples - showed that considerable amounts of commonsense knowledge can be encoded in the parameters of large language models. However, as parallel studies show that LMs are poor hypothesizers of declarative commonsense relationships on their own, it remains unclear whether this knowledge is learned during pretraining or from fine-tuning on KG examples. To investigate this question, we train commonsense knowledge models in few-shot settings to study the emergence of their commonsense representation abilities. Our results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to encoded knowledge learned during pretraining. Importantly, our analysis of absolute, angular, and distributional parameter changes during few-shot fine-tuning provides novel insights into how this interface is learned.

Xinchen voted
Ishan voted
Final decision
What was the agreed final decision?

#316 - Dai 2025
A GPT-assisted iterative method for extracting domain knowledge from a large volume of literature of electromagnetic wave absorbing materials with limited manually annotated data

Dai, D. B.; Zhang, G. J.; Wei, X.; Lin, Y. D.; Dai, M. M.; Peng, J. J.; Song, N.; Tang, Z.; Li, S. Z.; Liu, J. W.; Xu, Y.; Che, R. C.; Zhang, H. R.

Comput. Mater. Sci. 2025;246():11

2025

DOI: 10.1016/j.commatsci.2024.113431 · Ref ID: 3745

Research on electromagnetic wave absorbing materials is an important part of materials science. Each year, a substantial amount of academic literature is published in this field, containing critical information. Rapid and effective knowledge extraction from these documents is key to accelerating field development, and automated knowledge extraction based on deep learning provides a solution to this challenge. However, deep learning models typically require extensive annotated data for training, which is time-consuming and expensive to obtain in highly specialized subfields. To address this issue, this paper presents a GPT-assisted iterative training method that uses only 30 manually annotated literature abstracts as a training set and ultimately achieves an F1 score of 82.94% for a named entity recognition (NER) model. The effectiveness of this model is demonstrated by comparing it with other large language models commonly used in materials science on a custom dataset. We constructed a knowledge extraction framework centered around the obtained NER model and collected literature on electromagnetic wave absorbing materials from the last decade. The extraction and application results demonstrate the practicality of our framework.
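For reference, the F1 score reported in abstracts like this one is the harmonic mean of precision and recall over predicted entities. A minimal computation from entity-level true-positive, false-positive, and false-negative counts (the counts below are illustrative, not from the paper):

```python
def precision_recall_f1(tp, fp, fn):
    """Micro precision, recall, and F1 from entity-level counts.

    tp: predicted entities that match the gold annotation
    fp: predicted entities with no gold match
    fn: gold entities the model missed
    """
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

An F1 of 82.94% therefore implies the model's precision and recall are both in that neighborhood, with F1 pulled toward the lower of the two.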

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#461 - Dai 2022
Knowledge Neurons in Pretrained Transformers

Dai, D. M.; Dong, L.; Hao, Y. R.; Sui, Z. F.; Chang, B. B.; Wei, F. R.; Assoc Computat, Linguist

60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():8493-8502

Dublin, IRELAND Assoc Computational Linguistics-Acl 2022

Ref ID: 3578

Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus (Petroni et al., 2019; Jiang et al., 2020b). In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. Specifically, we examine the fill-in-the-blank cloze task for BERT. Given a relational fact, we propose a knowledge attribution method to identify the neurons that express the fact. We find that the activation of such knowledge neurons is positively correlated to the expression of their corresponding facts. In our case studies, we attempt to leverage knowledge neurons to edit (such as update, and erase) specific factual knowledge without fine-tuning. Our results shed light on understanding the storage of knowledge within pretrained Transformers. The code is available at https://github.com/Hunter-DDM/knowledge-neurons.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1491 - Dalal 2021
Knowledge augmented language models for causal question answering

Dalal, D.

CEUR Workshop Proceedings 2021;3005():17-24

CEUR-WS 2021

Ref ID: 5694

The task of causal question answering broadly involves reasoning about causal relations and causality over a provided premise. Causal question answering can be expressed across a variety of tasks including commonsense question answering, procedural reasoning, reading comprehension, and abductive reasoning. Transformer-based pretrained language models have shown great promise across many natural language processing (NLP) applications. However, these models are reliant on distributional knowledge learned during the pretraining process and are limited in their causal reasoning capabilities. Causal knowledge, often represented as cause-effect triples in a knowledge graph, can be used to augment and improve the causal reasoning capabilities of language models. There is limited work exploring the efficacy of causal knowledge for question answering tasks. We consider the challenge of structuring causal knowledge in language models and developing a unified model that can solve a broad set of causal question answering tasks. Copyright © 2021 for this paper by its authors.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3870 - Dani 2024
SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research

Dani, Meghal; Prakash, Muthu Jeyanthi; Akata, Zeynep; Liebe, Stefanie

arXiv 2024;():

2024

Ref ID: 8444

Large Language Models have shown promising results in their ability to encode general medical knowledge in standard medical question-answering datasets. However, their potential application in clinical practice requires evaluation in domain-specific tasks, where benchmarks are largely missing. In this study (SemioLLM), we test the ability of state-of-the-art LLMs (GPT-3.5, GPT-4, Mixtral 8x7B, and Qwen-72chat) to leverage their internal knowledge and reasoning for epilepsy diagnosis. Specifically, we obtain likelihood estimates linking unstructured text descriptions of seizures to seizure-generating brain regions, using an annotated clinical database containing 1269 entries. We evaluate the LLMs' performance, confidence, reasoning, and citation abilities in comparison to clinical evaluation. Models achieve above-chance classification performance, with prompt engineering significantly improving their outcome and some models achieving close-to-clinical performance and reasoning. However, our analyses also reveal significant pitfalls, with several models being overly confident while showing poor performance, as well as exhibiting citation errors and hallucinations. In summary, our work provides the first extensive benchmark comparing current SOTA LLMs in the medical domain of epilepsy and highlights their ability to leverage unstructured texts from patients' medical history to aid diagnostic processes in health care.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#652 - Das 2018
Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR

Das, M.; Fosler-Lussier, E.; Lin, S.; Moosavinasab, S.; Chen, D.; Rust, S.; Huang, Y. G.; Ramnath, R.; Assoc Computat, Linguist

SIGBioMed 17th Workshop on Biomedical Natural Language Processing (BioNLP) 2018;():118-128

Melbourne, AUSTRALIA Assoc Computational Linguistics-Acl 2018

Ref ID: 3287

In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they can represent facts about and capture relationships between entities very well. However, in domains such as medical information retrieval, where addressing specific information needs of complex queries may require understanding query intent by capturing novel associations between potentially latent concepts, these systems can fall short. In this work, we develop a novel, completely unsupervised, neural language model-based ranking approach for semantic tagging of documents, using the document to be tagged as a query into the model to retrieve candidate phrases from top-ranked related documents, thus associating every document with novel related concepts extracted from the text. For this we extend the word embedding-based generalized language model (GLM) of Ganguly et al. (2015) to employ phrasal embeddings, and use the semantic tags thus obtained for downstream query expansion, both directly and in feedback loop settings. Our method, evaluated using the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines using human expert-assigned concept tags for the queries, on top of a standard Okapi BM25-based document retrieval system.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#965 - Das 2024
AcKnowledge: Acquired Knowledge Representation by Small Language Model Without Pre-training

Das, S.; Chatterji, S.; Mukherjee, I.

KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():83-95

Association for Computational Linguistics (ACL) 2024

Ref ID: 4302

Large language models (LLMs) are pre-trained on enormous amounts of text data and show acclaimed success in knowledge representation. However, there are two bottlenecks with this approach. (1) Pre-training data cannot be regularly updated once the models are deployed, and it is not very fruitful if the model cannot represent updated knowledge. (2) The consistently increasing size and computational resources make it difficult for noncommercial and individual researchers to fine-tune and scale these language models. Major LLMs with external knowledge are also proprietary. In this paper, we propose AcKnowledge, a framework wrapped around a small, non-pre-trained language model for an open-domain question-answering (QA) experiment. AcKnowledge learns relevant knowledge from the internet via meta-learning based on user questions, and re-learns from user feedback if knowledge is misrepresented. Our efficient knowledge representation framework avoids pre-training overhead while enabling updated information. Benchmarking shows competitive performance against similarly sized state-of-the-art (SoTA) LLMs on gold standard QA datasets, demonstrating the potential of integrating internet search and user feedback for improved performance and generalizability. The repository of the work is available at https://github.com/SouravD-Me/AcKnowledge-KnowledgeLM-ACL-2024. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#2261 - Dasgupta 2013
A comprehensive sensor taxonomy and semantic knowledge representation: Energy meter use case

Dasgupta, R.; Dey, S.

2013 Seventh International Conference on Sensing Technology (ICST) 2013;():791-799

2013

DOI: 10.1109/ICSensT.2013.6727761 · Ref ID: 6109

The increasing use of sensors and their observations in applications such as environmental monitoring, security and surveillance, health care, infrastructure, and meteorology not only generates a huge amount of sensor data but also increases the complexity of integrating heterogeneous sensor devices, their data formats, and their measurement procedures. Ways to manage sensors, sensing devices, and systems, and thereby to handle the generation of large volumes of sensor data, are therefore becoming very important. The formal definition of sensor data encodings and the web services to store and access them, given by the Sensor Web Enablement (SWE) initiative of the Open Geospatial Consortium (OGC), provide syntactic interoperability, but collecting, reasoning, and querying on sensors and their observations require sensor semantic compatibility. This allows users to work with domain concepts, their relations, and restrictions, which is an abstraction above the technical nitty-gritty of diverse sensor data formats and their integration. The paper describes various sensor concepts and their relationships, extending the IEEE SUMO upper-level ontology and OntoSensor (including SensorML), and classifies sensor information into five major sensor knowledge representation categories: (1) hierarchy, (2) data, (3) function, (4) data exchange, and (5) domain specific, along with code snippets of semantic services generated by mapping conceptual relationships to structural relationships described in object-oriented languages such as C++ or Java.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3284 - Datta 2024
Construction of Hyper-Relational Knowledge Graphs Using Pre-Trained Large Language Models

Datta, Preetha; Vitiugin, Fedor; Chizhikova, Anastasiia; Sawhney, Nitin

arXiv 2024;():

2024

Ref ID: 8188

Extracting hyper-relations is crucial for constructing comprehensive knowledge graphs, but there are limited supervised methods available for this task. To address this gap, we introduce a zero-shot prompt-based method using OpenAI's GPT-3.5 model for extracting hyper-relational knowledge from text. Comparing our model with a baseline, we achieved promising results, with a recall of 0.77. Although our precision is currently lower, a detailed analysis of the model outputs has uncovered potential pathways for future research in this area.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1365 - Datta 2023
GREAT AI in Medical Appropriateness and Value-Based-Care

Datta, V. D.; Ganesh, S.; Haas, R. E.; Talukder, A. K.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14418 LNCS():16-33

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-49601-1_2 · Ref ID: 5038

Fee-For-Service, also known as the Volume-Based Care (VBC) model of healthcare, encourages service volume: more service, more reward. This model of care results in unnecessary, inappropriate, and wasted medical services. In the US, Fraud, Waste, and Abuse (FWA) ranges between $760 billion and $935 billion, accounting for approximately 25% of total healthcare spending. In India, the waste caused by FWA is estimated to be as high as 35%. This is due to a lack of smart digital health, an absence of AI models, and a lack of preventive vigilance against inappropriate medical interventions. Inappropriate medical intervention costs valuable resources and causes patient harm. This paper proposes GREAT AI (Generative, Responsible, Explainable, Adaptive, and Trustworthy Artificial Intelligence) in Medical Appropriateness. We show how GREAT AI is used to offer appropriate medical services. Moreover, we show how GREAT AI can function in a vigilance role to curb FWA. We present two GREAT AI models, namely MAKG (Medical Appropriateness Knowledge Graph) and RAG-GPT (Retrieval Augmented Generation – Generative Pretrained Transformer). MAKG is used as an autonomous coarse-grained medical-inappropriateness vigilance model for payers and regulators, whereas RAG-GPT is used as a fine-grained LLM with a human-in-the-loop for medical appropriateness and inappropriateness, where the human-in-the-loop actor can be anybody: providers, patients, payers, regulators, funders, or researchers. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#991 - deÁvilaMendes 2024
Application of Generative AI as an Enterprise Wikibase Knowledge Graph Q&A System

de Ávila Mendes, R.; de Oliveira, D. J.; Garcia, V. H. F.

KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():35-42

Association for Computational Linguistics (ACL) 2024

Ref ID: 4398

Generative AI and Large Language Models are increasingly used in business contexts. One application involves natural language conversations contextualized by company data, which can be accomplished with Enterprise Knowledge Graphs, standardized representations of data. This paper outlines an architecture for implementing an Enterprise Knowledge Graph using the open-source Wikibase software. Additionally, a Knowledge Graph Q&A System powered by Generative AI is presented. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1865 - DeBellis 2023
Semantic Interpretation of BERT embeddings with Knowledge Graphs

De Bellis, A.; Biancofiore, G. M.; Anelli, V. W.; Narducci, F.; Di Noia, T.; Ragone, A.; Di Sciascio, E.

CEUR Workshop Proceedings 2023;3478():181-191

CEUR-WS 2023

Ref ID: 5300

Pretrained language models have transformed the way we process natural languages, enhancing the performance of related systems. BERT has played a pivotal role in revolutionizing the field of Natural Language Processing (NLP). However, the deep learning framework behind BERT lacks interpretability. Recent research has focused on explaining the knowledge BERT acquires from the textual sources used for pre-training its linguistic model. In this study, we analyze the latent vector space produced by BERT's context-aware word embeddings. Our aim is to determine whether certain areas of the BERT vector space have an explicit meaning related to a Knowledge Graph (KG). Using the Link Prediction (LP) task, we demonstrate the presence of explicit and meaningful regions of the BERT vector space. Moreover, we establish links between BERT's vector space and specific ontology concepts in the KG by learning classification patterns. To the best of our knowledge, this is the first attempt to interpret BERT's learned linguistic knowledge through a KG by relying on its pre-trained context-aware word embeddings. © 2023 CEUR-WS. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1889 - deSá 2024
Socio-cultural adapted chatbots: Harnessing Knowledge Graphs and Large Language Models for enhanced context awareness

de Sá, J. M. C.; Anastasiou, D.; Da Silveira, M.; Pruski, C.

TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop 2024;():21-27

Association for Computational Linguistics (ACL) 2024

Ref ID: 4662

Understanding the socio-cultural context is crucial in machine translation (MT). Although conversational AI systems and chatbots, in particular, are not designed for translation, they can be used for MT purposes. Yet, chatbots often struggle to identify any socio-cultural context during user interactions. In this paper, we highlight this challenge with real-world examples from popular chatbots. We advocate for the use of knowledge graphs as an external source of information that can potentially encapsulate socio-cultural contexts, aiding chatbots in enhancing translation. We further present a method to exploit external knowledge and extract contextual information that can significantly improve text translation, as evidenced by our interactions with these chatbots. © 2024 Association for Computational Linguistics.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#2517 - Decker 2007
A Graphical Notation for Modeling Complex Events in Business Processes

Decker, G.; Grosskopf, A.; Barros, A.

11th IEEE International Enterprise Distributed Object Computing Conference (EDOC 2007) 2007;():27-27

2007

DOI: 10.1109/EDOC.2007.41 · Ref ID: 6615

Using complex event rules for capturing dependencies between business processes is an emerging trend in enterprise information systems. In previous work we have identified a set of requirements for event extensions for business process modeling languages. This paper introduces a graphical language for modeling composite events in business processes, namely BEMN, that fulfills all these requirements. These include event conjunction, disjunction and inhibition as well as cardinality of events whose graphical expression can be factored into flow-oriented process modeling and event rule modeling. Formal semantics for the language are provided.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2083 - Dehghani 2014
An abstract methodology for developing knowledge management systems

Dehghani, R.; Ramsin, R.

2014 10th International Conference on Innovations in Information Technology (IIT) 2014;():110-115

2014

DOI: 10.1109/INNOVATIONS.2014.6987572 · Ref ID: 6066

Powerful organizations are those that manage their power factors efficiently; organizational resources are considered vital power factors, and Knowledge is one of the most important resources to manage. There is no universally accepted Knowledge Management (KM) process, but it is known that establishing the appropriate knowledge flows in the organization is the main goal of organizational KM. A Knowledge Management System (KMS) is an information system which supports the KM process, mainly by providing the required knowledge and enhancing its flow. Organizations increasingly feel the need for appropriate methodologies for developing their target KMSs. However, existing KMS development methodologies are not comprehensive enough to satisfy all organizational needs. In this paper, we propose an abstract KMS development methodology which alleviates the weaknesses of existing methodologies while reusing their strengths. Method engineers can develop concrete methodologies by instantiating the proposed abstract methodology and adding the necessary detail, thus producing bespoke methodologies which are best suited to organizational needs.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3767 - Deng 2023
PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model

Deng, Cheng; Tong, Bo; Fu, Luoyi; Ding, Jiaxin; Cao, Dexing; Wang, Xinbing; Zhou, Chenghu

arXiv 2023;():

2023

Ref ID: 7668

In the research of end-to-end dialogue systems, using real-world knowledge to generate natural, fluent, and human-like utterances with correct answers is crucial. However, domain-specific conversational dialogue systems may be incoherent and introduce erroneous external information to answer questions due to the out-of-vocabulary issue or the wrong knowledge from the parameters of the neural network. In this work, we propose PK-Chat, a Pointer network guided Knowledge-driven generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs. The words generated by PK-Chat in the dialogue are derived from the prediction of word lists and the direct prediction of the external knowledge graph knowledge. Moreover, based on the PK-Chat, a dialogue system is built for academic scenarios in the case of geosciences. Finally, an academic dialogue benchmark is constructed to evaluate the quality of dialogue systems in academic scenarios and the source code is available online.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#2149 - Deng 2023
An Artificial Intelligence Model Recommendation Method for Power Dispatching Scenario Based on Knowledge Graph and Scene Label Matching

Deng, Y.; Xiao, D.; Yu, F.; Zhang, H.

2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2023;11():1151-1155

2023

DOI: 10.1109/ITAIC58329.2023.10409033 · Ref ID: 6024

In the field of power dispatching, more and more tasks are adopting artificial intelligence solutions, and the related research and literature are growing exponentially. To address the problem of artificial intelligence information overload in the power dispatching scenario, and to help researchers who have power dispatching experience but lack artificial intelligence experience use related algorithm models more effectively and conveniently in their work, this paper constructs a domain knowledge graph of artificial intelligence models for the power dispatching scenario and proposes an artificial intelligence model recommendation method based on knowledge graph and scene label matching. Artificial intelligence models mapped onto the knowledge graph are recommended according to the cosine similarity of scene labels. This method can be applied to actual power dispatching scenarios to provide personalized artificial intelligence model recommendations for different scientific research tasks, greatly improving the retrieval efficiency of the relevant researchers.
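The scene-label matching step described above can be illustrated as cosine similarity over binary label vectors, i.e. |A ∩ B| / √(|A| · |B|) for two label sets A and B. A minimal sketch, with a catalog and label names invented purely for illustration:

```python
from math import sqrt

def label_cosine(labels_a, labels_b):
    """Cosine similarity of two label sets, each treated as a binary vector."""
    a, b = set(labels_a), set(labels_b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend(scene_labels, catalog, top_n=3):
    """Rank catalog models (name -> label set) against a scene's labels
    and return the top_n model names by similarity."""
    scored = [(label_cosine(scene_labels, labels), name)
              for name, labels in catalog.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_n]]
```

In the paper's setting the label sets would come from the knowledge graph's scene annotations; here they are plain Python sets so the ranking logic is self-contained.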

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3469 - Dernbach 2024
GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding

Dernbach, Stefan; Agarwal, Khushbu; Zuniga, Alejandro; Henry, Michael; Choudhury, Sutanay

arXiv 2024;():

2024

Ref ID: 8088

Integrating large language models (LLMs) with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While large language models excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query an LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no: such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LAnguage Models (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the large language model's generative capabilities to create the dataset and proposes an efficient alternative to retrieval-augmented-generation-style methods.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1903 - Deshpande 2022
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes

Deshpande, A.; Ruiter, D.; Mosbach, M.; Klakow, D.

WOAH 2022 - 6th Workshop on Online Abuse and Harms, Proceedings of the Workshop 2022;():67-78

Association for Computational Linguistics (ACL) 2022

Ref ID: 5457

Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models. However, many techniques rely on human-compiled lists of bias terms, which are expensive to create and are limited in coverage. In this study, we present a fully data-driven pipeline for generating a knowledge graph (KG) of cultural knowledge and stereotypes. Our resulting KG covers 5 religious groups and 5 nationalities and can easily be extended to include more entities. Our human evaluation shows that the majority (59.2%) of non-singleton entries are coherent and complete stereotypes. We further show that performing intermediate masked language model training on the verbalized KG leads to a higher level of cultural awareness in the model and has the potential to increase classification performance on knowledge-crucial samples on a related task, i.e., hate speech detection. © 2022 Association for Computational Linguistics.

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#3206 - Díaz 2024
Automatic knowledge-graph creation from historical documents: The Chilean dictatorship as a case study

Díaz, Camila; Dunstan, Jocelyn; Etcheverry, Lorena; Fonck, Antonia; Grez, Alejandro; Mery, Domingo; Reutter, Juan; Rojas, Hugo

arXiv 2024;():

2024

Ref ID: 8552

We present our results regarding the automatic construction of a knowledge graph from historical documents related to the Chilean dictatorship period (1973-1990). Our approach consists of using LLMs to automatically recognize entities and the relations between them, and also to perform resolution across these sets of values. To prevent hallucination, the interaction with the LLM is grounded in a simple ontology with 4 types of entities and 7 types of relations. To evaluate our architecture, we use a gold-standard graph constructed from a small subset of the documents and compare it to the graph obtained from our approach when processing the same set of documents. Results show that the automatic construction manages to recognize a good portion of all the entities in the gold standard, and that those not recognized are mostly explained by the level of granularity at which the information is structured in the graph, not by the automatic approach missing an important entity. Looking forward, we expect this report will encourage work on other similar projects focused on enhancing research in the humanities and social sciences, but we remark that better evaluation metrics are needed in order to accurately fine-tune these types of architectures.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#2346 - Dietrich 2014
Distributed management and representation of data and context in robotic applications

Dietrich, A.; Zug, S.; Mohammad, S.; Kaiser, J.

2014 IEEE/RSJ International Conference on Intelligent Robots and Systems 2014;():1133-1140

2014

DOI: 10.1109/IROS.2014.6942700 · Ref ID: 6755

The traditional, isolated data handling in sensor-actuator systems does not fulfill the requirements of robots that need to interact with their smart environment. Consequently, we have to develop new mechanisms for adaptive data and context handling. We first investigate what types of data are present within smart environments and how they can be classified and organized. Only if the available data can be structured can it be queried and thus put into context. This is important because the variety of data and possible interpretations is tremendous, ranging from measurement values, sensor and robot descriptions/states/commands, to environmental data, such as positions, maps, spatial relations, etc. To cope with this diversity, we developed a solution capable of storing and accessing data within a distributed environment by providing additional context information. Furthermore, we describe how this information can be assembled in a task-oriented manner. This enables robots to dynamically generate environmental abstractions by using data from different sources and also enables them to incorporate external sensor measurements.

mohammed afaan voted
yuexi voted

#1978 - Dietze 2023
Towards syntax-aware pretraining and prompt engineering for knowledge retrieval from large language models

Dietze, S.; Jabeen, H.; Kallmeyer, L.; Linzbach, S.

CEUR Workshop Proceedings 2023;3577():

CEUR-WS 2023

Ref ID: 5080

The ability to access relational knowledge from LLM parameters, known as relational knowledge retrieval (rKR), is considered a critical factor in their capacity to comprehend and interpret natural language. However, the role of syntax in this context has not been adequately explored. In this position paper, we hypothesize a close link between the accessibility of relational knowledge and syntax. We discuss related works and lay out a research agenda focused on rKR from self-supervised LLMs with minimal or no fine-tuning, aiming at understanding the impact of syntax on rKR. This involves examining biases, factors affecting result reliability and robustness, and analyzing the effect of syntactic features in training corpora on rKR. We argue that rKR can be improved through syntax-aware pretraining and prompt engineering, and propose a dedicated research agenda geared toward exploring the impact of syntax on knowledge retrieval. © 2023 CEUR-WS. All rights reserved.

Ishan voted
Xinchen voted

#2635 - Dillon 2001
Lightweight analysis of operational specifications using inference graphs

Dillon, L. K.; Stirewalt, R. E. K.

Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001 2001;():57-67

2001

DOI: 10.1109/ICSE.2001.919081 · Ref ID: 6796

The Amalia framework generates lightweight components that automate the analysis of operational specifications and designs. A key concept is the step analyzer, which enables Amalia to automatically tailor high-level analyses, such as behavior simulation and model checking, to different specification languages and representations. A step analyzer uses a new abstraction, called an inference graph, for the analysis. It creates and evaluates an inference graph on-the-fly during a top-down traversal of a specification to deduce the specification's local behaviors (called steps). The nodes of an inference graph directly reify the rules in an operational semantics, enabling Amalia to automatically generate a step analyzer from an operational description of a notation's semantics. Inference graphs are a clean abstraction that can be formally defined. The paper provides a detailed but informal introduction to inference graphs. It uses example specifications written in LOTOS for purposes of illustration.

mohammed afaan voted
yuexi voted

#519 - Ding 2024
Leveraging Chain-of-Thought to Enhance Stance Detection with Prompt-Tuning

Ding, D. J.; Fu, X. H.; Peng, X. J.; Fan, X. M.; Huang, H.; Zhang, B. W.

Mathematics 2024;12(4):16

2024

DOI: 10.3390/math12040568 · Ref ID: 3783

Investigating public attitudes towards social media is crucial for opinion mining systems to gain valuable insights. Stance detection, which aims to discern the attitude expressed in an opinionated text towards a specific target, is a fundamental task in opinion mining. Conventional approaches mainly focus on sentence-level classification techniques. Recent research has shown that the integration of background knowledge can significantly improve stance detection performance. Despite the significant improvement achieved by knowledge-enhanced methods, applying these techniques in real-world scenarios remains challenging for several reasons. Firstly, existing methods often require the use of complex attention mechanisms to filter out noise and extract relevant background knowledge, which involves significant annotation efforts. Secondly, knowledge fusion mechanisms typically rely on fine-tuning, which can introduce a gap between the pre-training phase of pre-trained language models (PLMs) and the downstream stance detection tasks, leading to the poor prediction accuracy of the PLMs. To address these limitations, we propose a novel prompt-based stance detection method that leverages the knowledge acquired using the chain-of-thought method, which we refer to as PSDCOT. The proposed approach consists of two stages. The first stage is knowledge extraction, where instruction questions are constructed to elicit background knowledge from a VLPLM. The second stage is the multi-prompt learning network (M-PLN) for knowledge fusion, which learns model performance based on the background knowledge and the prompt learning framework. We evaluated the performance of PSDCOT on publicly available benchmark datasets to assess its effectiveness in improving stance detection performance. The results demonstrate that the proposed method achieves state-of-the-art results in in-domain, cross-target, and zero-shot learning settings.
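The two-stage flow this abstract outlines can be sketched with a stubbed language model standing in for the VLPLM: stage one builds an instruction question to elicit background knowledge about the target, and stage two prepends that knowledge to the stance prompt. Function names and prompt wording are assumptions, not the paper's.

```python
# Minimal sketch of knowledge elicitation followed by prompt assembly.

def elicit_knowledge(target, lm):
    """Stage 1: instruction question eliciting background knowledge."""
    return lm(f"What background knowledge is relevant to '{target}'?")

def stance_prompt(text, target, knowledge):
    """Stage 2: fuse the elicited knowledge into the stance prompt."""
    return (f"Background: {knowledge}\n"
            f"Text: {text}\n"
            f"What is the stance of the text towards '{target}'?")

# Stub LM for demonstration only; a real system would call the VLPLM.
stub_lm = lambda q: "Carbon taxes price emissions to curb climate change."
k = elicit_knowledge("carbon tax", stub_lm)
prompt = stance_prompt("This policy will save the planet.", "carbon tax", k)
```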

Kwesi voted
Xinchen voted

#3173 - Ding 2024
3DS: Decomposed Difficulty Data Selection's Case Study on LLM Medical Domain Adaptation

Ding, Hongxin; Fang, Yue; Zhu, Runchuan; Jiang, Xinke; Zhang, Jinyang; Xu, Yongxin; Chu, Xu; Zhao, Junfeng; Wang, Yasha

arXiv 2024;():

2024

Ref ID: 8704

Large Language Models (LLMs) excel in general tasks but struggle in specialized domains like healthcare due to limited domain-specific knowledge. Supervised Fine-Tuning (SFT) data construction for domain adaptation often relies on heuristic methods, such as GPT-4 annotation or manual data selection, with a data-centric focus on presumed diverse, high-quality datasets. However, these methods overlook the model's inherent knowledge distribution, introducing noise, redundancy, and irrelevant data, leading to a mismatch between the selected data and the model's learning task, resulting in suboptimal performance. To address this, we propose a two-stage model-centric data selection framework, Decomposed Difficulty Data Selection (3DS), which aligns data with the model's knowledge distribution for optimized adaptation. In Stage 1, we apply Prompt-Driven Data Selection via Explicit Alignment, where the model filters irrelevant or redundant data based on its internal knowledge. In Stage 2, we perform Decomposed Difficulty Data Selection, where data selection is guided by our defined difficulty decomposition, using three metrics: Instruction Understanding, Response Confidence, and Response Correctness. Additionally, an attention-based importance weighting mechanism captures token importance for more accurate difficulty calibration. This two-stage approach ensures the selected data is not only aligned with the model's knowledge and preferences but also appropriately challenging for the model to learn, leading to more effective and targeted domain adaptation. In the case study of the medical domain, our extensive experiments on real-world healthcare datasets demonstrate the superiority of 3DS over existing methods in accuracy by over 5.29%. Our dataset and code will be open-sourced at https://anonymous.4open.science/r/3DS-E67F.
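The stage-two selection can be sketched as follows: each candidate sample carries the three decomposed difficulty metrics named in the abstract, here assumed to be pre-normalized difficulty scores in [0, 1] (higher means harder for the model). Averaging the scores and the mid-difficulty band thresholds are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of difficulty-decomposition-based data selection: keep samples
# that are neither trivial nor hopeless for the model.

def difficulty(sample):
    """Aggregate the three decomposed difficulty scores."""
    return (sample["instruction_understanding"]
            + sample["response_confidence"]
            + sample["response_correctness"]) / 3.0

def select(samples, low=0.3, high=0.7):
    """Keep samples whose aggregate difficulty falls in a mid band."""
    return [s for s in samples if low <= difficulty(s) <= high]

pool = [
    {"id": 1, "instruction_understanding": 0.1,
     "response_confidence": 0.1, "response_correctness": 0.1},  # too easy
    {"id": 2, "instruction_understanding": 0.5,
     "response_confidence": 0.6, "response_correctness": 0.4},  # kept
]
chosen = select(pool)
```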

Xinchen voted
Davis voted

#3203 - Ding 2024
Automated Construction of Theme-specific Knowledge Graphs

Ding, Linyi; Zhou, Sizhe; Xiao, Jinfeng; Han, Jiawei

arXiv 2024;():

2024

Ref ID: 8262

Despite widespread applications of knowledge graphs (KGs) in various tasks such as question answering and intelligent conversational systems, existing KGs face two major challenges: information granularity and deficiency in timeliness. These hinder considerably the retrieval and analysis of in-context, fine-grained, and up-to-date knowledge from KGs, particularly in highly specialized themes (e.g., specialized scientific research) and rapidly evolving contexts (e.g., breaking news or disaster tracking). To tackle such challenges, we propose a theme-specific knowledge graph (i.e., ThemeKG), a KG constructed from a theme-specific corpus, and design an unsupervised framework for ThemeKG construction (named TKGCon). The framework takes raw theme-specific corpus and generates a high-quality KG that includes salient entities and relations under the theme. Specifically, we start with an entity ontology of the theme from Wikipedia, based on which we then generate candidate relations by Large Language Models (LLMs) to construct a relation ontology. To parse the documents from the theme corpus, we first map the extracted entity pairs to the ontology and retrieve the candidate relations. Finally, we incorporate the context and ontology to consolidate the relations for entity pairs. We observe that directly prompting GPT-4 for theme-specific KG leads to inaccurate entities (such as "two main types" as one entity in the query result) and unclear (such as "is", "has") or wrong relations (such as "have due to", "to start"). In contrast, by constructing the theme-specific KG step by step, our model outperforms GPT-4 and could consistently identify accurate entities and relations. Experimental results also show that our framework excels in evaluations compared with various KG construction baselines.
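The ontology-guided step described here, mapping an extracted entity pair to its ontology types and retrieving only the relations the relation ontology allows between those types, can be sketched as follows. All type, relation, and entity names are invented for illustration.

```python
# Sketch of candidate-relation retrieval constrained by a relation ontology.

ENTITY_TYPE = {"lithium": "Material", "cathode": "Component"}
RELATION_ONTOLOGY = {
    ("Material", "Component"): ["used_in", "coats"],
    ("Component", "Component"): ["connected_to"],
}

def candidate_relations(head, tail):
    """Look up the relations the ontology permits between the types of
    the head and tail entities; unknown type pairs yield no candidates."""
    key = (ENTITY_TYPE.get(head), ENTITY_TYPE.get(tail))
    return RELATION_ONTOLOGY.get(key, [])

rels = candidate_relations("lithium", "cathode")
```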

mohammed afaan voted
Ishan voted

#1351 - Ding 2023
Generative Semantic Modeling for Structured Data Source with Large Language Model

Ding, N.; Mayer, W.; Geng, Y.; Duan, Y.; Feng, Z.

Proceedings - 2023 IEEE International Conference on High Performance Computing and Communications, Data Science and Systems, Smart City and Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2023 2023;():1148-1152

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/HPCC-DSS-SmartCity-DependSys60770.2023.00164 · Ref ID: 4947

The paper introduces a generative semantic model for representing human knowledge in a way that enables computer understanding and reasoning. The current approach to semantic modeling involves mapping between the space of plausible semantic models and the provided data source. However, this approach has limitations, as the score functions used to search for the best candidate semantic model are either trained on a specific integration knowledge graph or rely on manually designed features. To address these limitations, the authors propose a new approach that combines an encoder built from a pre-trained large language model (LLM) with a graph decoder customized to generate semantics. The encoder-decoder system is designed to be trained on knowledge graphs, and the authors introduce an algorithm to generate training samples from a large knowledge graph by decomposing training samples into construction actions using a method similar to the transition system of a syntax parser. The proposed method is novel: it is the first time a generative method, empowered by an LLM and trained on knowledge graphs, has been applied to the semantic modeling task, achieving better performance on standard benchmarks than past work. In conclusion, the proposed generative semantic model offers a promising new approach to representing and organizing human knowledge in a more generalizable way, using a combination of a pre-trained LLM and a customized graph decoder trained on knowledge graphs. The approach has shown improved performance on standard benchmarks and has the potential to advance the field of semantic modeling. © 2023 IEEE.

Kwesi voted
Xinchen voted

#2007 - Ding 2023
A Unified Knowledge Graph Augmentation Service for Boosting Domain-specific NLP Tasks

Ding, R.; Han, X.; Wang, L.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():353-369

Association for Computational Linguistics (ACL) 2023

DOI: 10.18653/v1/2023.findings-acl.24 · Ref ID: 5173

By focusing the pre-training process on domain-specific corpora, some domain-specific pre-trained language models (PLMs) have achieved state-of-the-art results. However, it is under-investigated to design a unified paradigm to inject domain knowledge in the PLM fine-tuning stage. We propose KnowledgeDA, a unified domain language model development service to enhance the task-specific training procedure with domain knowledge graphs. Given domain-specific task texts as input, KnowledgeDA can automatically generate a domain-specific language model following three steps: (i) localize domain knowledge entities in texts via an embedding-similarity approach; (ii) generate augmented samples by retrieving replaceable domain entity pairs from two views of both knowledge graph and training data; (iii) select high-quality augmented samples for fine-tuning via confidence-based assessment. We implement a prototype of KnowledgeDA to learn language models for two domains, healthcare and software development. Experiments on domain-specific text classification and QA tasks verify the effectiveness and generalizability of KnowledgeDA. © 2023 Association for Computational Linguistics.
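Step (ii) above, generating augmented samples by swapping a localized domain entity for a replaceable entity drawn from a knowledge graph, can be sketched as follows. The tiny KG view and the example sentence are invented for illustration.

```python
# Sketch of entity-replacement data augmentation: each same-type KG
# neighbour of a localized entity yields one augmented training sample.

# Maps each entity to replaceable alternatives, as a KG view might.
REPLACEABLE = {
    "ibuprofen": ["naproxen", "aspirin"],
    "headache": ["migraine"],
}

def augment(text, entity):
    """Produce one augmented sample per replaceable KG neighbour."""
    return [text.replace(entity, alt) for alt in REPLACEABLE.get(entity, [])]

samples = augment("The patient took ibuprofen for pain.", "ibuprofen")
```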

Srividya voted
Xinchen voted

#2050 - Ding 2024
zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models

Ding, Z.; Cai, H.; Wu, J.; Ma, Y.; Liao, R.; Xiong, B.; Tresp, V.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():1877-1895

Association for Computational Linguistics (ACL) 2024

Ref ID: 4562

Modeling evolving knowledge over temporal knowledge graphs (TKGs) has become a heated topic. Various methods have been proposed to forecast links on TKGs. Most of them are embedding-based, where hidden representations are learned to represent knowledge graph (KG) entities and relations based on the observed graph contexts. Although these methods show strong performance on traditional TKG forecasting (TKGF) benchmarks, they face a strong challenge in modeling the unseen zero-shot relations that have no prior graph context. In this paper, we try to mitigate this problem as follows. We first input the text descriptions of KG relations into large language models (LLMs) for generating relation representations, and then introduce them into embedding-based TKGF methods. LLM-empowered representations can capture the semantic information in the relation descriptions. This makes the relations, whether seen or unseen, with similar semantic meanings stay close in the embedding space, enabling TKGF models to recognize zero-shot relations even without any observed graph context. Experimental results show that our approach helps TKGF models to achieve much better performance in forecasting the facts with previously unseen relations, while still maintaining their ability in link forecasting regarding seen relations. © 2024 Association for Computational Linguistics.
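The core idea, representing each relation by an embedding of its text description so that unseen relations with similar meanings land near seen ones in embedding space, can be sketched with a toy bag-of-words embedder standing in for the LLM. The vocabulary and relation descriptions are invented.

```python
# Sketch: description embeddings place semantically similar relations
# close together, letting a forecaster handle zero-shot relations.
import math

def embed(description):
    """Toy stand-in for an LLM sentence embedding: a bag-of-words count
    vector over a fixed vocabulary."""
    vocab = ["sign", "treaty", "agreement", "attack", "with"]
    words = description.lower().split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors (0 for orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(a * a for a in v)) or 1.0
    return dot / (nu * nv)

seen = embed("sign a treaty with")
unseen = embed("sign an agreement with")   # zero-shot relation
unrelated = embed("attack")
# The unseen relation sits closer to the semantically similar seen one.
```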

Mike voted
Srividya voted

#1403 - Ding 2024
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction

Ding, Z.; Huang, W.; Liang, J.; Yang, D.; Xiao, Y.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():8890-8901

European Language Resources Association (ELRA) 2024

Ref ID: 4559

Relation triple extraction, which outputs a set of triples from long sentences, plays a vital role in knowledge acquisition. Large language models can accurately extract triples from simple sentences through few-shot learning or fine-tuning when given appropriate instructions. However, they often miss triples when extracting from complex sentences. In this paper, we design an evaluation-filtering framework that integrates large language models with small models for relational triple extraction tasks. The framework includes an evaluation model that can extract related entity pairs with high precision. We propose a simple labeling principle and a deep neural network to build the model, embedding the outputs as prompts into the extraction process of the large model. We conduct extensive experiments to demonstrate that the proposed method can assist large language models in obtaining more accurate extraction results, especially from complex sentences containing multiple relational triples. Our evaluation model can also be embedded into traditional extraction models to enhance their extraction precision from complex sentences. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
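The prompt-assembly step described above, where candidate entity pairs produced by the small evaluation model are embedded into the large model's extraction prompt, can be sketched as follows. The wording, names, and example sentence are illustrative, not the paper's.

```python
# Sketch: the large model only has to name the relation for entity pairs
# the evaluation model has already judged to be related.

def build_prompt(sentence, candidate_pairs):
    """Embed evaluation-model outputs (entity pairs) into the prompt."""
    lines = [f"Sentence: {sentence}",
             "For each entity pair below, state their relation:"]
    for i, (a, b) in enumerate(candidate_pairs, 1):
        lines.append(f"{i}. ({a}, {b})")
    return "\n".join(lines)

p = build_prompt(
    "Ada Lovelace worked with Charles Babbage in London.",
    [("Ada Lovelace", "Charles Babbage"), ("Ada Lovelace", "London")],
)
```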

Mike voted
Xinchen voted

#2622 - Djemai 2024
Knowledge-based Reactive Planning and Re-planning – A Case-Study Approach

Djemai, R.; Vassilev, V.; Ouazzane, K.; Dey, M.

2024 IEEE Conference on Artificial Intelligence (CAI) 2024;():770-775

2024

DOI: 10.1109/CAI59869.2024.00147 · Ref ID: 6155

When a disaster strikes, man-made or natural, evacuation plans are put under immediate constraints, including topological, temporal, and spontaneously occurring events such as fire, smoke and obstacles introducing bottlenecks and impeding ingress and egress. Planning for uncertainties arising from indoor evacuations can be complex as there’s a fine balance to strike between a too-detailed plan and one that’s too vague. Such constraints apply to office and residential buildings, airports, mining sites, stadiums, ships, etc. Although some indoor spatial models have been developed, many are complex, and their applicability is non-universal. This paper proposes an innovative approach that harnesses the power of the Semantic Web Rule Language (SWRL) based on Web Ontology Language (OWL) to enhance existing evacuation planning methods through data-rich modelling. The OWL ontology serves as a formal representation of real-world concepts, their relationships, and properties. To demonstrate its application, the ontology is implemented in a case study involving London Metropolitan University’s Tower Building, and its design is elucidated in this paper.

mohammed afaan voted
Ishan voted

#1240 - Dobriy 2024
Employing RAG to Create a Conference Knowledge Graph from Text

Dobriy, D.

CEUR Workshop Proceedings 2024;3747():18

CEUR-WS 2024

Ref ID: 4345

In this paper, we present Semantic Observer, a platform that 1) defines a FAIR Conference Ontology for describing academic conferences, 2) presents a RAG architecture that constructs a Conference Knowledge Graph based on this ontology, and 3) evaluates the architecture on a corpus of the latest available CORE conference websites. The Conference Ontology models key entities such as conferences, workshops and challenges, organizer and programme committees, calls for papers and proposals, as well as major deadlines and relevant topics. In the evaluation, we compare the performance of three leading Large Language Models, among them GPT-4 Turbo and Claude 3 Opus, in supporting the Knowledge Graph construction from text. The best-performing RAG architecture is then implemented in Semantic Observer and made available in a SPARQL endpoint to make up-to-date conference information FAIR: findable, accessible, interoperable and reusable. © 2024 Copyright for this paper by its authors.

brandon voted
Kwesi voted

#1651 - Dong 2024
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering

Dong, J.; Zhang, Q.; Zhou, H.; Zha, D.; Zheng, P.; Huang, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():2417-2429

Association for Computational Linguistics (ACL) 2024

Ref ID: 4316

Knowledge-based visual question answering (KVQA) has been extensively studied to answer visual questions with external knowledge, e.g., knowledge graphs (KGs). While several attempts have been proposed to leverage large language models (LLMs) as an implicit knowledge source, it remains challenging since LLMs may generate hallucinations. Moreover, multiple knowledge sources, e.g., images, KGs and LLMs, cannot be readily aligned for complex scenarios. To tackle these, we present a novel modality-aware integration with LLMs for KVQA (MAIL). It carefully leverages multimodal knowledge for both image understanding and knowledge reasoning. Specifically, (i) we propose a two-stage prompting strategy with LLMs to densely embody the image into a scene graph with detailed visual features; (ii) We construct a coupled concept graph by linking the mentioned entities with external facts. (iii) A tailored pseudo-siamese graph medium fusion is designed for sufficient multimodal fusion. We utilize the shared mentioned entities in two graphs as mediums to bridge a tight inter-modal exchange, while maximally preserving insightful intra-modal learning by constraining the fusion within mediums. Extensive experiments show the superiority of MAIL. © 2024 Association for Computational Linguistics.

Kwesi voted
yuexi voted

#3157 - Dou 2024
ShennongMGS: An LLM-based Chinese Medication Guidance System

Dou, Yutao; Huang, Yuwei; Zhao, Xiongjun; Zou, Haitao; Shang, Jiandong; Lu, Ying; Yang, Xiaolin; Xiao, Jian; Peng, Shaoliang

ACM Trans. Manage. Inf. Syst. 2024;():

2024

DOI: 10.1145/3658451 · Ref ID: 7210

mohammed afaan voted
yuexi voted

#757 - Du 2024
Semantic-enhanced reasoning question answering over temporal knowledge graphs

Du, C. Y.; Li, X. G.; Li, Z. Y.

J. Intell. Inf. Syst. 2024;62(3):859-881

2024

DOI: 10.1007/s10844-024-00840-5 · Ref ID: 3191

Question Answering Over Temporal Knowledge Graphs (TKGQA) is an important topic in question answering. TKGQA focuses on accurately understanding questions involving temporal constraints and retrieving accurate answers from knowledge graphs. In previous research, the hierarchical structure of question contexts and the constraints imposed by temporal information on different sentence components have been overlooked. In this paper, we propose a framework called "Semantic-Enhanced Reasoning Question Answering" (SERQA) to tackle this problem. First, we adopt a pretrained language model (LM) to obtain the question relation representation vector. Then, we leverage syntactic information from the constituent tree and dependency tree, in combination with Masked Self-Attention (MSA), to enhance temporal constraint features. Finally, we integrate the temporal constraint features into the question relation representation using an information fusion function for answer prediction. Experimental results demonstrate that SERQA achieves better performance on the CRONQUESTIONS and ImConstrainedQuestions datasets. In comparison with existing temporal KGQA methods, our model exhibits outstanding performance in comprehending temporal constraint questions. The ablation experiments verified the effectiveness of combining the constituent tree and the dependency tree with MSA in question answering.

Srividya voted
Xinchen voted

#3543 - Du 2024
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering

Du, Haowei; Zhao, Dongyan

arXiv 2024;():

2024

Ref ID: 8555

Recent works have attempted to integrate external knowledge into LLMs to address the limitations and potential factual errors in LLM-generated content. However, how to retrieve the correct knowledge from the large amount of external knowledge imposes a challenge. To this end, we empirically observe that LLMs have already encoded rich knowledge in their pretrained parameters, and utilizing this internal knowledge improves the retrieval of external knowledge when applying them to knowledge-intensive tasks. In this paper, we propose a new internal and external knowledge interactive refinement paradigm dubbed IEKR to utilize internal knowledge in the LLM to help retrieve relevant knowledge from the external knowledge base, as well as exploit the external knowledge to refine the hallucination of generated internal knowledge. By simply adding a prompt like 'Tell me something about' to the LLM, we first elicit related explicit knowledge and insert it, together with the query, into the retriever for external retrieval. The external knowledge is then utilized to complement the internal knowledge in the input of the LLM for answering. We conduct experiments on 3 benchmark datasets in the knowledge-intensive question answering task with different LLMs and domains, achieving the new state-of-the-art. Further analysis shows the effectiveness of different modules in our approach.
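The retrieval step this abstract describes, prompting the model to recall its own internal knowledge and inserting that recalled text alongside the question into the retriever query, can be sketched as follows. The stub LM, the toy word-overlap retriever, and all example text are illustrative assumptions.

```python
# Sketch of internal-knowledge-expanded retrieval.

def internal_knowledge(entity, lm):
    """Elicit the model's own knowledge with a simple recall prompt."""
    return lm(f"Tell me something about {entity}")

def retrieve(query, knowledge, corpus):
    """Toy retriever: rank documents by word overlap with the expanded
    query (question plus recalled internal knowledge)."""
    expanded = set((query + " " + knowledge).lower().split())
    return max(corpus, key=lambda d: len(expanded & set(d.lower().split())))

# Stub LM for demonstration only.
stub_lm = lambda p: "Marie Curie won Nobel Prizes in physics and chemistry"
corpus = [
    "Marie Curie received the Nobel Prize in chemistry in 1911.",
    "The Eiffel Tower was completed in 1889.",
]
doc = retrieve("Which prizes did Curie win?",
               internal_knowledge("Marie Curie", stub_lm), corpus)
```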

Kwesi voted
mohammed afaan voted

#415 - Du 2023
KLDP: A Data Profiling Technique Based on Knowledge Graph and Large Language Modeling

Du, J. H.; Yin, H.

IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) / BigDataSE Conference / CSE Conference / EUC Conference / ISCI Conference 2023;():2333-2340

Exeter, ENGLAND Ieee Computer Soc 2023

DOI: 10.1109/TrustCom60117.2023.00329 · Ref ID: 3229

The explosive growth of medical data has enabled the establishment of patients' personal health records and provided favorable conditions for smart healthcare, but its fragmentation also brings challenges to patient management. Mainstream research focuses on utilizing medical data to construct disease knowledge graphs to assist patient management, but does not effectively manage massive patient data. In order to make full use of patient data and facilitate the circulation of patient data elements, we propose a new patient sketching technique, KLDP. It constructs knowledge graphs through pre-training techniques, effectively manages patient data based on patients' personal health records and medical history information throughout the treatment cycle, and elementalizes patient data, providing new ideas and implementation solutions for patient management.

mohammed afaan voted
yuexi voted

#291 - Du 2024
From Static to Dynamic: Knowledge Metabolism for Large Language Models

Du, M. Z.; Luu, A. T.; Ji, B.; Ng, S. K.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():23784-23786

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3567

The immense parameter space of Large Language Models (LLMs) endows them with superior knowledge retention capabilities, allowing them to excel in a variety of natural language processing tasks. However, it also instigates difficulties in consistently tuning LLMs to incorporate the most recent knowledge, which may further lead LLMs to produce inaccurate and fabricated content. To alleviate this issue, we propose DynaMind, a knowledge metabolism framework for LLMs, which proactively sustains the credibility of knowledge through an auxiliary memory component and directly delivers pertinent knowledge for LLM inference, thereby suppressing hallucinations caused by obsolete internal knowledge during the LLM inference process. Benchmark experiments demonstrate DynaMind's effectiveness in overcoming this challenge. The code and demo of DynaMind are available at: https://github.com/Elfsong/DynaMind.

yuexi voted
Mike voted

#2048 - Du 2024
ZhuJiu-Knowledge: A Fairer Platform for Evaluating Multiple Knowledge Types in Large Language Models

Du, P.; Liang, S.; Zhang, B.; Cao, P.; Chen, Y.; Liu, K.; Zhao, J.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;3():194-206

Association for Computational Linguistics (ACL) 2024

Ref ID: 4491

The swift advancement in large language models (LLMs) has heightened the importance of model evaluations. LLMs have acquired a substantial amount of knowledge, and evaluating the knowledge of these LLMs is crucial. To address this, we introduce the ZhuJiu-Knowledge benchmark which carefully considers the following factors: (1) For knowledge scope, we concentrate on three domains: commonsense knowledge, world knowledge, language knowledge, which comes from ATOMIC, Conceptnet, Wikidata, and Wordnet. (2) For data construction, to prevent data contamination, we utilize knowledge derived from corpora and knowledge graphs to formulate novel questions that are ensured not to appear in the training corpus. A multitude of prompts is purposefully devised to mitigate the impact of prompt design on evaluation and to further analyze the LLMs’ sensitivity to various prompts. (3) For evaluation criteria, we propose a novel voting methodology for assessing generative text, aligning the model’s evaluation with human preferences to reduce biases inherent in individual model assessments. We evaluate 14 current mainstream LLMs and conduct a comprehensive discussion and analysis of their results. The ZhuJiu-Knowledge benchmark and open-participation leaderboard are publicly released at http://zhujiu-knowledge.top/ and we also provide a demo video at https://youtu.be/QJp4qlEHVH8. © 2024 Association for Computational Linguistics.

Xinchen voted
Srividya voted

#698 - Du 2015
Ranking Web Page with Path Trust Knowledge Graph

Du, Y. J.; Hu, Q.; Li, X. L.; Chen, X. L.; Li, C. X.

5th International Conference on Intelligence Science and Big Data Engineering (IScIDE) 2015;9243():66-75

Suzhou, PEOPLES R CHINA Springer International Publishing Ag 2015

DOI: 10.1007/978-3-319-23862-3_7 · Ref ID: 3180

How to find and discover useful information from the Internet is a real challenge in information retrieval (IR) and search engines (SE). In this paper, we propose and construct the Path Trust Knowledge Graph (PTKG) model for assigning priority values to unvisited web pages. For a given user-specific topic t, its PTKG contains five parts: (1) the context graph G(t) = (V, E), where V is the crawled history web page set and E includes the hyperlink set among the history web pages; (2) retrieving knowledge implied in the paths among these web pages and finding their lengths; (3) building the trust degrees among the web pages; (4) constructing a topic-specific language model and a general language model by using the trust degrees; (5) assigning the priority values of web pages for ranking them. Finally, we perform an experimental comparison between our proposed PTKG approach and the classic LCG and RCG. As a result, our method outperforms LCG and RCG.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#511 - Duan 2021
Learning Numeracy: A Simple Yet Effective Number Embedding Approach Using Knowledge Graph

Duan, H. Y.; Yang, Y.; Tam, K. Y.

Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2021;():2597-2602

Punta Cana, DOMINICAN REP Assoc Computational Linguistics-Acl 2021

Ref ID: 2971

Numeracy plays a key role in natural language understanding. However, existing NLP approaches, whether the traditional word2vec approach or contextualized transformer-based language models, fail to learn numeracy. As a result, the performance of these models is limited when they are applied to number-intensive applications in the clinical and financial domains. In this work, we propose a simple number embedding approach based on a knowledge graph. We construct a knowledge graph consisting of number entities and magnitude relations, then apply a knowledge graph embedding method to obtain number vectors. Our approach is easy to implement, and experimental results on various numeracy-related NLP tasks demonstrate the effectiveness and efficiency of our method.
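The abstract only outlines the pipeline (number entities + magnitude relations, then a KG embedding method). As an illustrative sketch of that kind of pipeline, not the authors' implementation, the toy graph, relation name, dimensions, and TransE-style objective below are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy number knowledge graph: entities are numbers, one magnitude relation.
entities = [1, 2, 3, 4, 5]
e_idx = {e: i for i, e in enumerate(entities)}
# (head, relation, tail) triples meaning "head greater_than tail".
triples = [(e_idx[a], 0, e_idx[b]) for a, b in [(2, 1), (3, 2), (4, 3), (5, 4)]]

dim = 8
E = rng.normal(scale=0.1, size=(len(entities), dim))  # entity embeddings
R = rng.normal(scale=0.1, size=(1, dim))              # relation embeddings

def score(h, r, t):
    # TransE-style score: smaller ||h + r - t|| means more plausible.
    return np.linalg.norm(E[h] + R[r] - E[t])

def train(steps=500, lr=0.05, margin=1.0):
    for _ in range(steps):
        h, r, t = triples[rng.integers(len(triples))]
        t_neg = rng.integers(len(entities))  # corrupt the tail for a negative
        if t_neg == t:
            continue
        d_pos = E[h] + R[r] - E[t]
        d_neg = E[h] + R[r] - E[t_neg]
        loss = margin + np.linalg.norm(d_pos) - np.linalg.norm(d_neg)
        if loss > 0:  # margin ranking loss: push positives below negatives
            g_pos = d_pos / (np.linalg.norm(d_pos) + 1e-9)
            g_neg = d_neg / (np.linalg.norm(d_neg) + 1e-9)
            E[h] -= lr * (g_pos - g_neg)
            E[t] += lr * g_pos
            E[t_neg] -= lr * g_neg
            R[r] -= lr * (g_pos - g_neg)

pos_before = np.mean([score(*tr) for tr in triples])
train()
pos_after = np.mean([score(*tr) for tr in triples])
print(pos_after < pos_before)  # true magnitude triples score better after training
```

The resulting rows of `E` would serve as the number vectors that the abstract says are fed into downstream numeracy tasks.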

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#773 - Duan 2023
Simple Knowledge Graph Completion Model Based on Differential Negative Sampling and Prompt Learning

Duan, L.; Wang, J.; Luo, B.; Sun, Q.

Information 2023;14(8):15

2023

DOI: 10.3390/info14080450 · Ref ID: 3067

Knowledge graphs (KGs) serve as a crucial resource for numerous artificial intelligence tasks, significantly contributing to the advancement of the AI field. However, the incompleteness of existing KGs hinders their effectiveness in practical applications. Consequently, researchers have proposed the task of KG completion. Currently, embedding-based techniques dominate the field, as they leverage the structural information within KGs to infer and complete missing parts. Nonetheless, these methods exhibit limitations: they are constrained by the quality and quantity of structural information, and they are unable to handle entities missing from the original KG. To overcome these challenges, researchers have attempted to integrate pre-trained language models and textual data to perform KG completion. This approach utilizes the definition statements and description text of entities within KGs, with the goal of compensating for the latent connections that are difficult for traditional methods to obtain. However, text-based methods still lag behind embedding-based models in terms of performance. Our analysis reveals that the critical issue lies in the selection of negative samples. To enhance the performance of text-based methods, various types of negative sampling are employed in this study. We introduce prompt learning to bridge the gap between the pre-trained language model and the KG completion task and to improve the model's reasoning. Simultaneously, a ranking strategy based on KG structural information is proposed to use structured KG data to assist reasoning. The experimental results demonstrate that our model exhibits strong competitiveness and outstanding inference speed. By fully exploiting the internal structural information of KGs and external descriptive text resources, we successfully elevate the performance of KG completion tasks across various metrics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3883 - Dunn 2022
Structured information extraction from complex scientific text with fine-tuned large language models

Dunn, Alexander; Dagdelen, John; Walker, Nicholas; Lee, Sanghoon; Rosen, Andrew S.; Ceder, Gerbrand; Persson, Kristin; Jain, Anubhav

arXiv 2022;():

2022

Ref ID: 7624

Intelligently extracting and linking complex scientific information from unstructured text is a challenging endeavor particularly for those inexperienced with natural language processing. Here, we present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction for complex hierarchical information in scientific text. The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts (inputs) and completions (outputs). Information is extracted either from single sentences or across sentences in abstracts/passages, and the output can be returned as simple English sentences or a more structured format, such as a list of JSON objects. We demonstrate that LLMs trained in this way are capable of accurately extracting useful records of complex scientific knowledge for three representative tasks in materials chemistry: linking dopants with their host materials, cataloging metal-organic frameworks, and general chemistry/phase/morphology/application information extraction. This approach represents a simple, accessible, and highly-flexible route to obtaining large databases of structured knowledge extracted from unstructured text. An online demo is available at http://www.matscholar.com/info-extraction.
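The abstract describes fine-tuning on prompt/completion pairs whose completions are structured output such as a list of JSON objects. As a hedged illustration of what one such training example might look like (the field names "dopant"/"host" and the separator are assumptions for the dopant-host task the abstract names, not the authors' actual schema):

```python
import json

# One hypothetical fine-tuning example in prompt/completion style:
# the prompt holds the source sentence, the completion a JSON record list.
example = {
    "prompt": "Extract dopant-host pairs:\n"
              "'We study Mn-doped GaAs thin films.'\n\n###\n\n",
    "completion": json.dumps([{"dopant": "Mn", "host": "GaAs"}]),
}

# Downstream code would parse the model's completion back into records.
records = json.loads(example["completion"])
print(records[0]["dopant"], records[0]["host"])  # prints "Mn GaAs"
```

Because the completion is machine-parseable JSON rather than free text, extracted records can be loaded directly into a structured database, which is the accessibility argument the abstract makes.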

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1989 - Dunn 2024
Transforming Generative Large Language Models' Limitations into Strengths using Gestalt: A Synergetic Approach to Mathematical Problem-Solving with Computational Engines

Dunn, C. W.; Tonekaboni, N. H.

Proceedings of the Annual Hawaii International Conference on System Sciences 2024;():5185-5194

IEEE Computer Society 2024

Ref ID: 4446

This paper presents an innovative approach, known as Gestalt, to enhance the mathematical problem-solving capabilities of Generative Large Language Models (GLLMs) while addressing their inherent limitations. Recognizing the structure-discerning strength of GLLMs, the core of our approach strategically offloads computations, deterministic questions, and knowledge retrieval to external tools such as Wolfram Alpha and the Python REPL. This critical augmentation not only mitigates GLLMs' variable reliability in these areas but also fortifies their innate strength: understanding the underlying structure of the problems at hand. With this novel implementation, GLLMs can harness the potential of external systems through well-structured queries, enabling them to make significant strides in problem-solving. In a preliminary evaluation, the Gestalt system demonstrates exceptional performance on a portion of the MATH benchmark dataset, achieving a state-of-the-art accuracy of 59.00%; in comparison, GPT-4 achieves an accuracy of 53.9% on the identical dataset. Through our augmentation approach, we aim to transform the limitations of GLLMs into strengths, opening up exciting new possibilities not only in advanced mathematical problem-solving but also in various deterministic tasks such as medical diagnosis. © 2024 IEEE Computer Society. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1415 - Dura 2002
Information retrieval based on explicit knowledge representation

Dura, E.; Drejak, M.

CEUR Workshop Proceedings 2002;1168():

CEUR-WS 2002

Ref ID: 5840

The tool we tested in the present monolingual retrieval task, Lexware®, is based on explicit knowledge representation, not on statistical language modeling. In the present task, Lexware® indexing seems satisfactory while its query builder is not. The system has been tested extensively on indexing Swedish parliamentary debates, with very good results. We are happy that Swedish is finally introduced into CLEF; unfortunately, the present test suite is not reliable. The Swedish parliamentary debates may perhaps be used instead: they are numerous, constantly growing, and thoroughly indexed manually with keywords chosen from a thesaurus of about 4000 items. Copyright © 2002 for the individual papers by the papers' authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#629 - Durmaz 2024
An ontology-based text mining dataset for extraction of process-structure-property entities

Durmaz, A. R.; Thomas, A.; Mishra, L.; Murthy, R. N.; Straub, T.

Sci. Data 2024;11(1):16

2024

DOI: 10.1038/s41597-024-03926-5 · Ref ID: 3285

While large language models learn sound statistical representations of language and the information therein, ontologies are symbolic knowledge representations that can complement them ideally. Research at this critical intersection relies on datasets that intertwine ontologies and text corpora to enable training and comprehensive benchmarking of neurosymbolic models. We present the MaterioMiner dataset and the linked materials mechanics ontology, in which ontological concepts from the mechanics-of-materials domain are associated with textual entities within the literature corpus. Another distinctive feature of the dataset is its exceptionally fine-grained annotation. Specifically, 179 distinct classes were manually annotated by three raters within four publications, amounting to 2191 annotated and curated entities. Conceptual work is presented for the symbolic representation of causal composition-process-microstructure-property relationships. We explore the annotation consistency between the three raters and fine-tune pre-trained language models to showcase the feasibility of training named entity recognition models. Reusing the dataset can foster the training and benchmarking of materials language models, automated ontology construction, and knowledge graph generation from textual data.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3973 - Egami 2024
VHAKG: A Multi-modal Knowledge Graph Based on Synchronized Multi-view Videos of Daily Activities

Egami, Shusaku; Ugai, Takahiro; Htun, Swe Nwe Nwe; Fukuda, Ken

arXiv 2024;():

2024

Ref ID: 8563

Multi-modal knowledge graphs (MMKGs), which ground various non-symbolic data (e.g., images and videos) into symbols, have attracted attention as resources enabling knowledge processing and machine learning across modalities. However, the construction of MMKGs for videos consisting of multiple events, such as daily activities, is still in the early stages. In this paper, we construct an MMKG based on synchronized multi-view simulated videos of daily activities. Besides representing the content of daily life videos as event-centric knowledge, our MMKG also includes frame-by-frame fine-grained changes, such as bounding boxes within video frames. In addition, we provide support tools for querying our MMKG. As an application example, we demonstrate that our MMKG facilitates benchmarking vision-language models by providing the necessary vision-language datasets for a tailored task.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1638 - Eggert 2023
Memory Net: Generalizable Common-Sense Reasoning over Real-World Actions and Objects

Eggert, J.; Deigmoeller, J.; Smirnov, P.; Takeuchi, J.; Richter, A.

International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K - Proceedings 2023;2():182-189

Science and Technology Publications, Lda 2023

DOI: 10.5220/0012182300003598 · Ref ID: 5078

In this paper, we explore how artificial agents (AAs) can understand and reason about so-called "action patterns" within real-world settings. Essentially, we want AAs to determine which tools fit specific actions, and which actions can be executed with certain tools, objects, or agents, based on real-world situations. To achieve this, we utilize a comprehensive Knowledge Graph, called "Memory Net", filled with interconnected everyday concepts, common actions, and environmental data. Our approach involves an inference technique that harnesses semantic proximity through subgraph matching. Compared against human responses and a state-of-the-art natural-language-model-based machine learning approach in a home scenario, our Knowledge Graph method demonstrated strong generalization capabilities, suggesting its promise in dynamic, incremental, and interactive real-world settings. Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0).

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2500 - Eisermann 2021
Generalization in Multimodal Language Learning from Simulation

Eisermann, A.; Lee, J. H.; Weber, C.; Wermter, S.

2021 International Joint Conference on Neural Networks (IJCNN) 2021;():1-8

2021

DOI: 10.1109/IJCNN52387.2021.9534275 · Ref ID: 6357

Neural networks can be powerful function approximators, able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution. Naturally, they perform well at generalizing within the limits of their target function, but they often fail to generalize outside of the explicitly learned feature space. It is therefore an open research topic whether and how neural network-based architectures can be deployed for systematic reasoning. Many studies have shown evidence of poor generalization, but they often work with abstract data or are limited to single-channel input. Humans, however, learn and interact through a combination of multiple sensory modalities, and rarely rely on just one. To investigate compositional generalization in a multimodal setting, we generate an extensible dataset with multimodal input sequences from simulation. We investigate the influence of the underlying training data distribution on compositional generalization in a minimal LSTM-based network trained in a supervised, time-continuous setting. We find that compositional generalization fails in simple setups but improves with the number of objects and actions, and particularly with substantial color overlap between objects. Furthermore, multimodality strongly improves compositional generalization in settings where a pure vision model struggles to generalize.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1442 - Elena 2024
THE ISSUES OF CREATION OF MACHINE-UNDERSTANDABLE SMART STANDARDS BASED ON KNOWLEDGE GRAPHS

Elena, S.; Valeria, G.

Inform. Autom. 2024;23(4):969-988

2024

DOI: 10.15622/ia.23.4.2 · Ref ID: 4458

The development of digital transformation requires the widespread use of digital technologies in standardization documents. One of the goals is to create standards with machine-understandable content that allow digital documents to be used at various stages of development and production without a human operator. The purpose of this work is to describe an approach for creating industry normative documents and translating them into a machine-understandable representation for further use in software services and systems. There are three types of SMART standard content: machine-readable, machine-interpretable, and machine-understandable. Knowledge graphs are actively used to formalize data and knowledge when solving various problems. A new two-level approach is proposed for creating regulatory documents as knowledge graphs and translating them into a machine-understandable representation. The approach defines two types of interpretation of a smart document (human readability and machine understandability) through two related formats: a graph, each semantic node of which represents text in a natural language, and a network of concepts and strict connections. Each node of a human-readable graph corresponds, in general, to a subtree of a machine-readable knowledge graph. As the basis for transforming one form of smart standard representation into another, LLMs are used, supplemented by a specialized adapter obtained through additional training using the Parameter-Efficient Fine-Tuning approach. Requirements have been established for a set of problem- and subject-oriented tools for generating knowledge graphs. The conceptual architecture of the system for supporting the solution of a set of problems based on knowledge graphs is shown, and the principles for implementing software components that work with smart knowledge for intelligent software services are established. © 2024 St. Petersburg Federal Research Center of the Russian Academy of Sciences. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3250 - Ershov 2023
A Case Study for Compliance as Code with Graphs and Language Models: Public release of the Regulatory Knowledge Graph

Ershov, Vladimir

arXiv 2023;():

2023

Ref ID: 7642

The paper presents a study on using language models to automate the construction of an executable Knowledge Graph (KG) for compliance. It focuses on Abu Dhabi Global Market regulations and taxonomy, and involves manually tagging a portion of the regulations and training BERT-based models, which are then applied to the rest of the corpus. Coreference resolution and syntax analysis were used to parse the relationships between the tagged entities and to form a KG stored in a Neo4j database. The paper argues that machine learning models released by regulators to automate the interpretation of rules are a vital step towards compliance automation, demonstrates concept querying with Cypher, and states that the produced sub-graphs, combined with Graph Neural Networks (GNNs), will achieve expandability in judgment automation systems. The graph is open-sourced on GitHub to provide structured data for future advancements in the field.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#1273 - Esmeir 2022
Entity Retrieval from Multilingual Knowledge Graphs

Esmeir, S.; Câmara, A.; Meij, E.

MRL 2022 - 2nd Workshop on Multi-Lingual Representation Learning, Proceedings of the Workshop 2022;():1-15

Association for Computational Linguistics (ACL) 2022

Ref ID: 5326

Knowledge Graphs (KGs) are structured databases that capture real-world entities and their relationships. The task of entity retrieval from a KG aims at retrieving a ranked list of entities relevant to a given user query. While English-only entity retrieval has attracted considerable attention, user queries, as well as the information contained in the KG, may be represented in multiple (and possibly distinct) languages. Furthermore, KG content may vary between languages due to different information sources and points of view. Recent advances in language representation have enabled natural ways of bridging gaps between languages. In this paper, we, therefore, propose to utilise language models (LMs) and diverse entity representations to enable truly multilingual entity retrieval. We propose two approaches: (i) an array of monolingual retrievers and (ii) a single multilingual retriever trained using queries and documents in multiple languages. We show that while our approach is on par with the significantly more complex state-of-the-art method for the English task, it can be successfully applied to virtually any language with an LM. Furthermore, it allows languages to benefit from one another, yielding significantly better performance for both low- and high-resource languages. © 2022 Association for Computational Linguistics.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#826 - Ezzabady 2024
Towards Generating High-Quality Knowledge Graphs by Leveraging Large Language Models

Ezzabady, M. K.; Ieng, F.; Khorashadizadeh, H.; Benamara, F.; Groppe, S.; Sahri, S.

29th International Conference on Applications of Natural Language to Information Systems (NLDB) 2024;14762():455-469

Univ Turin, Turin, ITALY Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-70239-6_31 · Ref ID: 3256

Knowledge graph creation requires relation extraction (RE) tools, often trained on data annotated either manually or by distant supervision. Recent approaches operate at the model level to handle new domains with unseen relations, relying on transfer learning or generative approaches in few-/zero-shot learning scenarios. In this paper, we adopt a different strategy by operating instead at the level of dataset creation. We investigate, for the first time to the best of our knowledge, the ability of prompt-based models to build high-quality RE datasets, relying on GPT-4 to extract triples from sentences. Our approach is further enhanced by linking our knowledge graph to Wikidata, a step that enriches our dataset and ensures its interoperability. This strategy has been successfully employed in two use cases: COVID and health relation extraction.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#678 - Faghihi 2024
Prompt2DeModel: Declarative Neuro-Symbolic Modeling with Natural Language

Faghihi, H. R.; Nafar, A.; Uszok, A.; Karimian, H.; Kordjamshidi, P.

18th International Conference on Neural-Symbolic Learning and Reasoning (NeSy) 2024;14980():315-327

Barcelona, SPAIN Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-71170-1_25 · Ref ID: 3719

This paper presents a conversational pipeline for crafting domain knowledge for complex neuro-symbolic models through natural language prompts. It leverages large language models to generate declarative programs in the DomiKnowS framework. The programs in this framework express concepts and their relationships as a graph in addition to logical constraints between them. The graph, later, can be connected to trainable neural models according to those specifications. Our proposed pipeline utilizes techniques like dynamic in-context demonstration retrieval, model refinement based on feedback from a symbolic parser, visualization, and user interaction to generate the tasks' structure and formal knowledge representation. This approach empowers domain experts, even those not well-versed in ML/AI, to formally declare their knowledge to be incorporated in customized neural models in the DomiKnowS framework.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#672 - Fan 2024
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion

Fan, C. H.; Chen, Y. J.; Xue, J.; Kong, Y. H.; Tao, J. H.; Lv, Z.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():8380-8388

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3284

In recent years, knowledge graph completion (KGC) models based on pre-trained language models (PLMs) have shown promising results. However, the large number of parameters and high computational cost of PLMs pose challenges for their application in downstream tasks. This paper proposes a progressive distillation method based on masked generation features for the KGC task, aiming to significantly reduce the complexity of pre-trained models. Specifically, we perform pre-distillation on the PLM to obtain high-quality teacher models, and compress the PLM network to obtain multi-grade student models. However, traditional feature distillation suffers from the limitation of having a single representation of information in teacher models. To solve this problem, we propose masked generation of teacher-student features, which contain richer representation information. Furthermore, there is a significant gap in representation ability between teacher and student. Therefore, we design a progressive distillation method that distills student models at each grade level, enabling efficient knowledge transfer from teachers to students. The experimental results demonstrate that the model in the pre-distillation stage surpasses the existing state-of-the-art methods. Furthermore, in the progressive distillation stage, the model significantly reduces the model parameters while maintaining a certain level of performance. Specifically, the parameters of the lowest-grade student model are reduced by 56.7% compared to the baseline.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3479 - Fan 2024
Graph Machine Learning in the Era of Large Language Models (LLMs)

Fan, Wenqi; Wang, Shijie; Huang, Jiani; Chen, Zhikai; Song, Yu; Tang, Wenzhuo; Mao, Haitao; Liu, Hui; Liu, Xiaorui; Yin, Dawei; Li, Qing

arXiv 2024;():

2024

Ref ID: 8252

Graphs play an important role in representing complex relationships in various domains like social networks, knowledge graphs, and molecular discovery. With the advent of deep learning, Graph Neural Networks (GNNs) have emerged as a cornerstone in Graph Machine Learning (Graph ML), facilitating the representation and processing of graph structures. Recently, LLMs have demonstrated unprecedented capabilities in language tasks and are widely adopted in a variety of applications such as computer vision and recommender systems. This remarkable success has also attracted interest in applying LLMs to the graph domain. Increasing efforts have been made to explore the potential of LLMs in advancing Graph ML's generalization, transferability, and few-shot learning ability. Meanwhile, graphs, especially knowledge graphs, are rich in reliable factual knowledge, which can be utilized to enhance the reasoning capabilities of LLMs and potentially alleviate their limitations such as hallucinations and the lack of explainability. Given the rapid progress of this research direction, a systematic review summarizing the latest advancements for Graph ML in the era of LLMs is necessary to provide an in-depth understanding to researchers and practitioners. Therefore, in this survey, we first review the recent developments in Graph ML. We then explore how LLMs can be utilized to enhance the quality of graph features, alleviate the reliance on labeled data, and address challenges such as graph heterogeneity and out-of-distribution (OOD) generalization. Afterward, we delve into how graphs can enhance LLMs, highlighting their abilities to enhance LLM pre-training and inference. Furthermore, we investigate various applications and discuss the potential future directions in this promising field.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3259 - Fang 2023
CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population

Fang, Tianqing; Do, Quyet V.; Zheng, Zihao; Wang, Weiqi; Choi, Sehyun; Wang, Zhaowei; Song, Yangqiu

arXiv 2023;():

2023

Ref ID: 7678

Commonsense Knowledge Base (CSKB) Population, which aims at automatically expanding knowledge in CSKBs with external resources, is an important yet hard task in NLP. Fang et al. (2021a) proposed a CSKB Population (CKBP) framework with an evaluation set, CKBP v1. However, CKBP v1 relies on crowdsourced annotations that suffer from a considerable number of mislabeled answers, and the evaluation set lacks alignment with the external knowledge source due to random sampling. In this paper, we introduce CKBP v2, a new high-quality CSKB Population evaluation set that addresses these two issues by employing domain experts as annotators and incorporating diversified adversarial samples to make the evaluation data more representative. We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning. A better population model can also help acquire more informative commonsense knowledge as additional supervision signals for both generative commonsense inference and zero-shot commonsense question answering. Specifically, the question-answering model based on DeBERTa-v3-large (He et al., 2023b) even outperforms powerful large language models in a zero-shot setting, including ChatGPT and GPT-3.5.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1012 - Fang 2023
Automatic Knowledge Structuration of Automotive User Manual for Question Answering

Fang, Y.; Chen, Y.; Jiang, Z.; Xiao, J.; Ge, Y.

Proceedings - 2023 4th International Conference on Computer, Big Data and Artificial Intelligence, ICCBD+AI 2023 2023;():184-190

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICCBD-AI62252.2023.00038 · Ref ID: 4971

Automotive user manuals serve as repositories of valuable information pertaining to a vehicle, and leveraging question answering (QA) systems provides users with a convenient means to access this knowledge. In pursuit of an efficient QA system for such documents, this paper proposes organizing the content into a structured, knowledge-graph-like triplet format. After conducting a comprehensive analysis of automotive user manual content, we introduce a <subject, function, content> (<s, f, c>) triplet representation for the knowledge. Our approach involves a three-step pipeline for extracting these triplets from semi-structured XML documents. Central to this structure is the "content" node, forming the core of knowledge items. Leveraging the in-context learning abilities of an off-the-shelf Large Language Model (LLM), specifically ChatGPT, the "subject" and "function" components are induced from the "content" node. To ensure compactness and coherence in the knowledge representation, a tailored phrase normalization process is designed to select identical phrases. Additionally, an LLM-powered evaluation method is employed to validate the extracted triplets, affirming their accuracy and relevance. This methodology demonstrates the effectiveness of our proposed approach in automating the structuration of knowledge within automotive user manuals for seamless QA. © 2023 IEEE.
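The abstract names the <subject, function, content> triplet but gives no concrete layout. A minimal sketch of how such triplets might be stored and queried for QA lookup follows; the field names mirror the paper's <s, f, c> notation, while the sample data and lookup function are assumptions for illustration only:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SFCTriplet:
    subject: str   # s: the vehicle component the knowledge item is about
    function: str  # f: what task or function the item concerns
    content: str   # c: the core knowledge item from the manual

# Hypothetical knowledge items extracted from a manual.
manual_kb = [
    SFCTriplet("parking brake", "secure the vehicle",
               "Press the pedal firmly before engaging the parking brake."),
    SFCTriplet("parking brake", "release",
               "Pull the release lever while pressing the brake pedal."),
]

def lookup(subject, function_keyword):
    """Return contents of triplets matching a subject and function keyword."""
    return [t.content for t in manual_kb
            if t.subject == subject and function_keyword in t.function]

print(lookup("parking brake", "release"))
```

Keying QA retrieval on the normalized subject and function fields is what lets a question like "How do I release the parking brake?" be routed to the right content node.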

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1160 - Fang 2022
Data-Efficient Concept Extraction from Pre-trained Language Models for Commonsense Explanation Generation

Fang, Y.; Zhang, Y.

Findings of the Association for Computational Linguistics: EMNLP 2022 2022;():5912-5922

Association for Computational Linguistics (ACL) 2022

Ref ID: 5404

Predicting the key explanation concept is essential for generating commonsense explanations. This paper introduces a method to predict the concept from pre-trained language models for commonsense explanation generation. Our experiments found that adopting a language model as the concept extractor and fine-tuning it with 20% of the training data can improve the quality and accuracy of the generated explanations over multiple evaluation metrics. Compared with conventional methods that search for concepts over knowledge graphs, our method does not require preparing and training models to search through knowledge graphs. To better understand the results from pre-trained language models, we also designed a metric to evaluate the retrieved concepts. Through analysis and experiments, we show the correlation between this metric and the performance of the generators, and we also show the importance of attaching concepts for generating high-quality sentences. © 2022 Association for Computational Linguistics.

Xinchen voted
Ishan voted
Final decision
What was the agreed final decision?

#762 - Färber 2023
SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples

Färber, M.; Lamprecht, D.; Krause, J.; Aung, L.; Haase, P.

22nd International Semantic Web Conference (ISWC) 2023;14266():94-112

Athens, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-47243-5_6 · Ref ID: 3289

We present SemOpenAlex, an extensive RDF knowledge graph that contains over 26 billion triples about scientific publications and their associated entities, such as authors, institutions, journals, and concepts. SemOpenAlex is licensed under CC0, providing free and open access to the data. We offer the data through multiple channels, including RDF dump files, a SPARQL endpoint, and as a data source in the Linked Open Data cloud, complete with resolvable URIs and links to other data sources. Moreover, we provide embeddings for knowledge graph entities using high-performance computing. SemOpenAlex enables a broad range of use-case scenarios, such as exploratory semantic search via our website, large-scale scientific impact quantification, and other forms of scholarly big data analytics within and across scientific disciplines. Additionally, it enables academic recommender systems, such as recommending collaborators, publications, and venues, including explainability capabilities. Finally, SemOpenAlex can serve for RDF query optimization benchmarks, creating scholarly knowledge-guided language models, and as a hub for semantic scientific publishing. Data and Services: https://semopenalex.org https://w3id.org/SemOpenAlex Code: https://github.com/metaphacts/semopenalex/ Data License: Creative Commons Zero (CC0) Code License: MIT License

mohammed afaan voted
yuexi voted

#1124 - Feng 2024
Construction and Application of Knowledge Graph for Water Engineering Scheduling Based on Large Language Model

Feng, J.; Chang, Y.; Lu, J.; Tang, H.; Lyu, Z.; Qiu, Y.

J. Frontier. Comput. Sci. Technol. 2024;18(6):1637-1647

2024

DOI: 10.3778/j.issn.1673-9418.2311098 · Ref ID: 4567

With the growth of water conservancy and the increasing demand for information, handling and representing large volumes of water-related data has become complex. In particular, scheduling textual data often exists in natural-language form, lacking clear structure and standardization. Processing and utilizing such diverse data necessitates extensive domain knowledge and professional expertise. To tackle this challenge, a method based on a large language model is proposed to construct a knowledge graph for water engineering scheduling. This approach involves collecting and preprocessing scheduling rule data at the data layer, leveraging large language models to extract embedded knowledge, constructing the ontology at the conceptual layer, and applying a "three-step" prompt strategy for extraction at the instance layer. Through the interaction of the data, conceptual, and instance layers, high-performance extraction of rule texts is achieved, and the construction of the dataset and knowledge graph is completed. Experimental results show that the F1 score of the proposed extraction method reaches 85.5%, and the effectiveness and rationality of the large language model's modules are validated through ablation experiments. The graph integrates dispersed water conservancy rule information, effectively handles unstructured textual data, and offers visual querying and functionality tracing. It aids professionals in assessing water conditions and selecting appropriate scheduling schemes, providing valuable support for conservancy decision-making and intelligent reasoning. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.

Xinchen voted
mohammed afaan voted

#1548 - Feng 2024
Label design method for flood control scheduling rules assisted by LLM

Feng, J.; Lu, Z.; Fan, Z.; Kong, X.; Lu, J.; Zhou, S.

Shuili Xuebao 2024;55(8):920-930

2024

DOI: 10.13243/j.cnki.slxb.20230643 · Ref ID: 3992

The information extraction of flood control dispatching rules is of great significance for flood control dispatching automation, and the design of labeling systems is pivotal for information extraction. Traditional designs often suffer from comprehension biases and omissions, leading to issues like overgeneralization and incompleteness. Addressing these imperfections, this research emphasizes the extraction of rules from flood scheduling texts, proposing an enhanced approach for labeling optimization. Large language models (LLMs) are utilized for tasks such as label refinement and generation, boosting label precision and clarity, and a technique for extracting entity-relationship triplets is presented for datasets with many labels. Grouping these triplets enhances extraction performance in label-rich datasets. A visual knowledge graph for flood control scheduling using Neo4j is also developed. This research offers foundational insights for future work in flood control scheduling knowledge extraction. © 2024 International Research and Training Center on Erosion and Sedimentation and China Water and Power Press. All rights reserved.

Kwesi voted
mohammed afaan voted

#3718 - Feng 2024
Monitoring Latent World States in Language Models with Propositional Probes

Feng, Jiahai; Russell, Stuart; Steinhardt, Jacob

arXiv 2024;():

2024

Ref ID: 8431

Language models are susceptible to bias, sycophancy, backdoors, and other tendencies that lead to unfaithful responses to the input context. Interpreting internal states of language models could help monitor and correct unfaithful behavior. We hypothesize that language models represent their input contexts in a latent world model, and seek to extract this latent world state from the activations. We do so with 'propositional probes', which compositionally probe tokens for lexical information and bind them into logical propositions representing the world state. For example, given the input context ''Greg is a nurse. Laura is a physicist.'', we decode the propositions ''WorksAs(Greg, nurse)'' and ''WorksAs(Laura, physicist)'' from the model's activations. Key to this is identifying a 'binding subspace' in which bound tokens have high similarity (''Greg'' and ''nurse'') but unbound ones do not (''Greg'' and ''physicist''). We validate propositional probes in a closed-world setting with finitely many predicates and properties. Despite being trained on simple templated contexts, propositional probes generalize to contexts rewritten as short stories and translated to Spanish. Moreover, we find that in three settings where language models respond unfaithfully to the input context – prompt injections, backdoor attacks, and gender bias – the decoded propositions remain faithful. This suggests that language models often encode a faithful world model but decode it unfaithfully, which motivates the search for better interpretability tools for monitoring LMs.

yuexi voted
Davis voted

#3979 - Feng 2024
VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models

Feng, Kailai; Zhang, Yabo; Yu, Haodong; Ji, Zhilong; Bai, Jinfeng; Zhang, Hongzhi; Zuo, Wangmeng

arXiv 2024;():

2024

Ref ID: 8650

Artistic typography is a technique to visualize the meaning of an input character in an imaginative and readable manner. With powerful text-to-image diffusion models, existing methods directly design the overall geometry and texture of the input character, making it challenging to ensure both creativity and legibility. In this paper, we introduce a dual-branch, training-free method, namely VitaGlyph, enabling flexible artistic typography along with controllable geometry change to maintain readability. The key insight of VitaGlyph is to treat the input character as a scene composed of a Subject and a Surrounding, and to render them under varying degrees of geometry transformation. The subject flexibly expresses the essential concept of the input character, while the surrounding enriches the relevant background without altering the shape. Specifically, we implement VitaGlyph through a three-phase framework: (i) Knowledge Acquisition leverages large language models to design text descriptions of the subject and surrounding. (ii) Regional Decomposition detects the part that best matches the subject description and divides the input glyph image into subject and surrounding regions. (iii) Typography Stylization first refines the structure of the subject region via Semantic Typography, and then separately renders the textures of the Subject and Surrounding regions through Controllable Compositional Generation. Experimental results demonstrate that VitaGlyph not only achieves better artistry and readability, but also manages to depict multiple customized concepts, facilitating more creative and pleasing artistic typography generation. Our code will be made publicly available at https://github.com/Carlofkl/VitaGlyph.

Mike voted
Srividya voted

#404 - Feng 2023
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding

Feng, S. B.; Tan, Z. X.; Zhang, W. Q.; Lei, Z. Y.; Tsvetkov, Y.

61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():2116-2138

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3137

With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts-from local (e.g., sentence), to document-level, to global knowledge-to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a "context fusion" layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets.

Ishan voted
Srividya voted

#1127 - Feng 2024
Construction of Gem Knowledge Graph Based on Large Language Model

Feng, S.; Shi, B.; Zheng, Y.

J. Gem. Gemmol. 2024;26(3):105-112

2024

DOI: 10.15964/j.cnki.027jgg.2024.03.012 · Ref ID: 4042

The sources of gemmological knowledge include books, journals, courses, markets, and related disciplines. A complete gemmological knowledge system is of great significance to the jewelry industry. Gem knowledge points are numerous and stored in relative isolation, which makes it difficult for practitioners and researchers to retrieve knowledge. This problem can be solved by constructing a gem knowledge base system. A graph can handle the complex associations between knowledge points, which is impossible for the widely used structured databases; therefore, a knowledge base in the form of a knowledge graph is selected. This paper introduces the traditional knowledge graph construction method and points out its difficulties: high cost, heavy workload, difficult technology, and somewhat low accuracy. It proposes using a large language model (LLM) to complete some tasks in knowledge graph construction in order to reduce the cost and workload. A new LLM-based knowledge graph construction approach is conceived, whose steps include data cleaning, knowledge acquisition, and knowledge refinement. Following this approach, a gemstone knowledge graph covering undergraduate-level gemstone knowledge is constructed, and some query scenarios are displayed. The feasibility and efficiency of the new method are demonstrated by our test evaluation, and possible application directions of the graph are discussed. © 2024 Editorial Department of Journal of Gems and Gemmology, China University of Geosciences. All rights reserved.

Xinchen voted
mohammed afaan voted

#1211 - Feng 2024
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration

Feng, S.; Shi, W.; Wang, Y.; Ding, W.; Balachandran, V.; Tsvetkov, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():14664-14690

Association for Computational Linguistics (ACL) 2024

Ref ID: 4293

Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps (missing or outdated information in LLMs) might always persist given the evolving nature of knowledge. In this work, we study approaches to identify LLM knowledge gaps and abstain from answering questions when knowledge gaps are present. We first adapt existing approaches to model calibration or adaptation through fine-tuning/prompting and analyze their ability to abstain from generating low-confidence outputs. Motivated by their failures in self-reflection and over-reliance on held-out sets, we propose two novel approaches that are based on model collaboration, i.e., LLMs probing other LLMs for knowledge gaps, either cooperatively or competitively. Extensive experiments with three LLMs on four QA tasks featuring diverse knowledge domains demonstrate that both cooperative and competitive approaches to unveiling LLM knowledge gaps achieve up to 19.3% improvements on abstain accuracy against the strongest baseline. Further analysis reveals that our abstention methods pinpoint failure cases in retrieval augmentation and knowledge gaps in multi-hop reasoning. © 2024 Association for Computational Linguistics.

Xinchen voted
Davis voted

#3906 - Feng 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback

Feng, Shangbin; Shi, Weijia; Wang, Yike; Ding, Wenxuan; Ahia, Orevaoghene; Li, Shuyue Stella; Balachandran, Vidhisha; Sitaram, Sunayana; Tsvetkov, Yulia

arXiv 2024;():

2024

Ref ID: 8414

Multilingual LLMs often have knowledge disparities across languages, with larger gaps in under-resourced languages. Teaching LLMs to abstain in the face of knowledge gaps is thus a promising strategy to mitigate hallucinations in multilingual settings. However, previous studies on LLM abstention primarily focus on English; we find that directly applying existing solutions beyond English results in up to 20.5% performance gaps between high- and low-resource languages, potentially due to LLMs' drop in calibration and reasoning beyond a few resource-rich languages. To this end, we propose strategies to enhance LLM abstention by learning from multilingual feedback, where LLMs self-reflect on proposed answers in one language by generating multiple feedback items in related languages: we show that this helps identify knowledge gaps across diverse languages, cultures, and communities. Extensive experiments demonstrate that our multilingual feedback approach outperforms various strong baselines, achieving up to 9.2% improvement for low-resource languages across three black-box and open models on three datasets, featuring open-book, closed-book, and commonsense QA. Further analysis reveals that multilingual feedback is both an effective and a more equitable abstention strategy to serve diverse language speakers, and that cultural factors have great impact on language selection and LLM abstention behavior, highlighting future directions for multilingual and multi-cultural reliable language modeling.

Xinchen voted
Davis voted

#170 - Feng 2023
Detecting contradictions from IoT protocol specification documents based on neural generated knowledge graph

Feng, X. G.; Zhang, Y. J.; Meng, M. H.; Li, Y. S.; Joe, C. E.; Wang, Z.; Bai, G. D.

ISA Trans. 2023;141():10-19

2023

DOI: 10.1016/j.isatra.2023.04.025 · Ref ID: 3075

Due to the boom of the Internet of Things (IoT) in recent years, various IoT devices are connected to the Internet and communicate with each other through network protocols such as the Constrained Application Protocol (CoAP). These protocols are typically defined and described in specification documents, such as Requests for Comments (RFCs), which are written in natural or semi-formal languages. Since developers largely follow the specification documents when implementing web protocols, they have become the de facto protocol specifications. Therefore, it must be ensured that the descriptions in them are consistent to avoid technological issues, incompatibility, security risks, or even legal concerns. In this work, we propose the Neural RFC Knowledge Graph (NRFCKG), a contradiction-detection tool for IoT protocol specification documents based on a neural-network-generated knowledge graph. Our approach automatically parses the specification documents and constructs knowledge graphs from them through entity extraction, relation extraction, and rule extraction with large language models. It then conducts intra-entity and inter-entity contradiction detection over the generated knowledge graph. We implement NRFCKG and apply it to the most extensively used messaging protocols in IoT, including the main RFC (RFC7252) of CoAP, the specification document of MQTT, and the specification document of AMQP. Our evaluation shows that NRFCKG generalizes well to other specification documents and manages to detect contradictions in these IoT protocol specification documents. © 2023 ISA. Published by Elsevier Ltd. All rights reserved.

mohammed afaan voted
yuexi voted

#605 - Fernando 2020
Neural memory plasticity for medical anomaly detection

Fernando, T.; Denman, S.; Ahmedt-Aristizabal, D.; Sridharan, S.; Laurens, K. R.; Johnston, P.; Fookes, C.

Neural Netw. 2020;127():67-81

2020

DOI: 10.1016/j.neunet.2020.04.011 · Ref ID: 3728

In the domain of machine learning, Neural Memory Networks (NMNs) have recently achieved impressive results in a variety of application areas including visual question answering, trajectory prediction, object tracking, and language modelling. However, we observe that the attention-based knowledge retrieval mechanisms used in current NMNs restrict them from achieving their full potential, as the attention process retrieves information based on a set of static connection weights. This is suboptimal in a setting where there are vast differences among samples in the data domain, such as anomaly detection, where there is no consistent criterion for what constitutes an anomaly. In this paper, we propose a plastic neural memory access mechanism which exploits both static and dynamic connection weights in the memory read, write, and output generation procedures. We demonstrate the effectiveness and flexibility of the proposed memory model in three challenging anomaly detection tasks in the medical domain: abnormal EEG identification, MRI tumour type classification, and schizophrenia risk detection in children. In all settings, the proposed approach outperforms the current state-of-the-art. Furthermore, we perform an in-depth analysis demonstrating the utility of neural plasticity for the knowledge retrieval process and provide evidence on how the proposed memory model generates sparse yet informative memory outputs. © 2020 Elsevier Ltd. All rights reserved.

Kwesi voted
Davis voted

#3528 - Forer 2024
Inferring Scientific Cross-Document Coreference and Hierarchy with Definition-Augmented Relational Reasoning

Forer, Lior; Hope, Tom

arXiv 2024;():

2024

Ref ID: 8622

We address the fundamental task of inferring cross-document coreference and hierarchy in scientific texts, which has important applications in knowledge graph construction, search, recommendation and discovery. LLMs can struggle when faced with many long-tail technical concepts with nuanced variations. We present a novel method which generates context-dependent definitions of concept mentions by retrieving full-text literature, and uses the definitions to enhance detection of cross-document relations. We further generate relational definitions, which describe how two concept mentions are related or different, and design an efficient re-ranking approach to address the combinatorial explosion involved in inferring links across papers. In both fine-tuning and in-context learning settings we achieve large gains in performance. We provide analysis of generated definitions, shedding light on the relational reasoning ability of LLMs over fine-grained scientific concepts.

Mike voted
mohammed afaan voted

#2797 - Francis 2017
Poster Abstract: Context Intelligence in Pervasive Environments

Francis, J.; Oltramari, A.; Munir, S.; Shelton, C.; Rowe, A.

2017 IEEE/ACM Second International Conference on Internet-of-Things Design and Implementation (IoTDI) 2017;():315-316

2017

Ref ID: 6443

Intelligent personalization systems are becoming increasingly reliant on contextually-relevant devices and services, such as those available within modern IoT deployments. An IoT context may emerge, or become pervasive, when the intelligent system generates knowledge from dialogue-based interactions with the end-user; the context is strengthened even further by incorporating state representations about the environment (e.g., generated from wireless sensor data) into the knowledge graph. This is crucial for pervasive applications like digital assistance in IoT, where context-aware systems need to adapt quickly: activities like leaving work home-bound, driving to the grocery store, arriving at home, and walking the dog, for example, can occur in a relatively short period of time, during which an intelligent assistant must be able to support user requests in a consistent and coherent manner. Given that computational ontologies can serve as semantic models for heterogeneous data, they are becoming increasingly viable for reasoning across different IoT contexts. This involves: (a) federation and dynamic pruning of multiple modular ontologies, ideally, to comprehensively capture only the knowledge that will facilitate execution of a multi-context task; (b) fast consistency-checking and ontology-based inferences, aided by rules-based execution environments that can evaluate/transform ambient wireless sensor network (WSN) data, in real-time; and (c) run-time execution of ontology-based control procedures, through rule-engine actuation commands sent across the WSN. Only by realizing these functionalities may intelligent systems be capable of reasoning over device properties, system states, and user activities, while appropriately delegating commands to other intelligent agents or other relevant IoT services.
In this poster, we illustrate how a multi-context knowledge base can be structured on the basis of modular ontologies and integrated with a distributed rules-based inference engine in multiple smart-building environments, in order to enable scalable contextual reasoning for intelligent assistance. Preliminary results are also discussed. This work is conducted through the partnership of Bosch Research Pittsburgh and Carnegie Mellon University (CMU), and is in partial satisfaction of CMU's Bosch Energy Research Network (BERN) grant, awarded for developments in intelligent building solutions. The approach we describe is also partially based on the Ubiquitous Personal Assistant (UPA) project, Bosch Research's largest research initiative worldwide.

mohammed afaan voted
yuexi voted

#1808 - Fu 2024
Research on KG and LLM knowledge-enhanced pediatric diseases intelligent diagnosis

Fu, W.; Dai, D.; Zhang, K.; Liu, X.; Zhang, H.; Ao, L.; Xiao, J.

Proceedings of SPIE - The International Society for Optical Engineering 2024;13171():

SPIE 2024

DOI: 10.1117/12.3032061 · Ref ID: 4534

Pediatric diseases are challenging to diagnose due to their complex and diverse characteristics. To assist doctors in diagnosis and help them make informed decisions, this paper proposes a Knowledge graph and Large language model Knowledge-Enhanced (KLKE) intelligent diagnosis model. The intelligent diagnosis task is treated as a text classification task, in which the original Electronic Medical Records are input into a MacBERT encoder to obtain contextual representations enhanced by key information and by KG-prompted LLM knowledge, respectively. The final text representation is obtained by concatenating and merging the enhanced representations. A Graph Convolutional Network is utilized to obtain the knowledge representation, and the two representations are fused using a method based on an interactive attention mechanism. Experiments are conducted on PeEMR; compared with models that fuse only triples and graph structures, KLKE achieves increases of 9.15% and 2.28% in F1_micro scores, respectively. © 2024 SPIE.

Kwesi voted
mohammed afaan voted

#889 - Fukuda 2024
Zero-Shot Query Experiments in Knowledge Graph Reasoning Challenge for Older Adults Safety

Fukuda, K.; Ugai, T.; Egami, S.; Matsushita, K.; Ieee

18th IEEE International Conference on Semantic Computing (ICSC) 2024;():301-305

Laguna Hills, CA Ieee Computer Soc 2024

DOI: 10.1109/icsc59802.2024.00054 · Ref ID: 3149

The 2nd International Knowledge Graph Reasoning Challenge addresses social issues focusing on the safety of older adults in their homes. The challenge aims to extract statistical information related to actions and objects that pose risks to daily life. To answer each question about a video, we used Video-LLaVa, a large-scale visual language model (LVLM), in two approaches. The first approach involves inputting the question text and the video into Video-LLaVa. In this paper, we describe the results of zero-shot queries. The second approach obtains a detailed description of the video from Video-LLaVa and then answers questions based on it. We have yet to achieve good results with these approaches, but we have identified some issues, which we discuss along with the results.

mohammed afaan voted
yuexi voted

#1179 - Furumai 2023
Detecting Dialogue Hallucination Using Graph Neural Networks

Furumai, K.; Wang, Y.; Shinohara, M.; Ikeda, K.; Yu, Y.; Kato, T.

Proceedings - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023 2023;():871-877

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICMLA58977.2023.00128 · Ref ID: 4945

Even though large language models (LLMs) accumulate tremendous knowledge, dialogue systems built with LLMs induce hallucinations, leading to the generation of non-factual responses. How to provide proper references to achieve interpretable hallucination detection is a key issue that needs to be addressed. In this paper, we propose a graph neural network (GNN)-based method to achieve high-performance and interpretable hallucination detection for domain-specific dialogue systems. The method involves performing graph matching between a reference knowledge graph obtained from a knowledge database and a response knowledge graph extracted from the response to detect non-factual responses. By comparing with strong baselines, our method achieves a recall improvement of up to 11% and infers the cause of hallucinations with a probability of over 79%. © 2023 IEEE.

yuexi voted
Mike voted

#2765 - Gallo 2007
An Ontology for the Quality of Experience framework

Gallo, E.; Siller, M.; Woods, J.

2007 IEEE International Conference on Systems, Man and Cybernetics 2007;():1540-1544

2007

DOI: 10.1109/ICSMC.2007.4414109 · Ref ID: 6668

Agents need a formal representation of knowledge, which is modelled in an ontology. We present a survey of ontologies in the area of QoS management. From the survey, we identified improvements that can be made to the ontology of the Quality of Experience framework. We believe that with this extension, full QoS management capabilities can then be supported in the context of Quality of Experience. We focus on appropriate QoS mechanism selection, network monitoring, and QoS adaptation. With the additional concepts and actions, the QoE ontology meets three key requirements for a QoS ontology: (i) deciding which QoS mechanism best fits the user's needs; (ii) performing QoS monitoring and detection of SLA violations; and (iii) carrying out QoS adaptation. Two experimental scenarios are currently being conducted: in scenario 1 the original ontology is used, whilst in scenario 2 the extended version is employed. An initial comparative analysis is performed.

mohammed afaan voted
yuexi voted

#3696 - Gan 2023
Making Large Language Models Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation

Gan, Chunjing; Yang, Dan; Hu, Binbin; Liu, Ziqi; Shen, Yue; Zhang, Zhiqiang; Gu, Jinjie; Zhou, Jun; Zhang, Guannan

arXiv 2023;():

2023

Ref ID: 7978

Nowadays, the rapid development of the mobile economy has promoted the flourishing of online marketing campaigns, whose success greatly hinges on the efficient matching between user preferences and desired marketing campaigns, where a well-established Marketing-oriented Knowledge Graph (dubbed as MoKG) could serve as the critical "bridge" for preference propagation. In this paper, we seek to carefully prompt a Large Language Model (LLM) with domain-level knowledge as a better marketing-oriented knowledge miner for marketing-oriented knowledge graph construction, which is however non-trivial, suffering from several inevitable issues in real-world marketing scenarios, i.e., uncontrollable relation generation of LLMs, insufficient prompting ability of a single prompt, and the unaffordable deployment cost of LLMs. To this end, we propose PAIR, a novel Progressive prompting Augmented mIning fRamework for harvesting a marketing-oriented knowledge graph with LLMs. In particular, we reduce pure relation generation to an LLM-based adaptive relation filtering process through the knowledge-empowered prompting technique. Next, we steer LLMs toward entity expansion with progressive prompting augmentation, followed by a reliable aggregation with comprehensive consideration of both self-consistency and semantic relatedness. In terms of online serving, we specialize in a small and white-box PAIR (i.e., LightPAIR), which is fine-tuned with a high-quality corpus provided by a strong teacher LLM. Extensive experiments and practical applications in audience targeting verify the effectiveness of the proposed (Light)PAIR.

Davis voted
Srividya voted

#162 - Gao 2024
Deep Learning-Based Fault Knowledge Graph Construction for Power Communication Networks

Gao, D. Q.; Zhu, P. Y.; Wang, S.; Zhao, Z. Y.; Ieee

6th Asia Energy and Electrical Engineering Symposium (AEEES) 2024;():1088-1093

Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu, PEOPLES R CHINA Ieee 2024

DOI: 10.1109/aeees61147.2024.10544941 · Ref ID: 3028

Power communication networks are a crucial infrastructure of the modern power system, and their maintenance capabilities are crucial to ensuring the stable operation of power grid services. As an organized semantic knowledge base, a knowledge graph effectively organizes power communication network fault documentation and expert experience to enhance intelligent maintenance. This paper outlines a top-down approach to systematically construct a fault knowledge graph in the domain of power communication networks. The approach utilizes a seven-step method to establish a domain ontology model and integrates deep learning algorithms, including pre-trained language models, bidirectional long short-term memory networks, convolutional neural networks, and attention mechanisms. These algorithms process unstructured text to extract key entities and relationships. The effectiveness of the approach is verified through experiments using a product device document as a test case. The extracted knowledge is then visualized and stored using a Neo4j database. Finally, this paper proposes a knowledge service model centered on the fault knowledge graph and explores its application in fault diagnosis.

mohammed afaan voted
Ishan voted

#3576 - Gao 2022
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models

Gao, Daniel; Jia, Yantao; Li, Lei; Fu, Chengzhen; Dou, Zhicheng; Jiang, Hao; Zhang, Xinyu; Chen, Lei; Cao, Zhao

arXiv 2022;():

2022

Ref ID: 7522

Previous works show the great potential of pre-trained language models (PLMs) for storing a large amount of factual knowledge. However, to figure out whether PLMs can be reliable knowledge sources and used as alternative knowledge bases (KBs), we need to further explore some critical features of PLMs. Firstly, knowledge memorization and identification abilities: traditional KBs can store various types of entities and relationships; do PLMs have a high knowledge capacity to store different types of knowledge? Secondly, reasoning ability: a qualified knowledge source should not only provide a collection of facts, but also support a symbolic reasoner. Can PLMs derive new knowledge based on the correlations between facts? To evaluate these features of PLMs, we propose a benchmark, named Knowledge Memorization, Identification, and Reasoning test (KMIR). KMIR covers 3 types of knowledge, including general knowledge, domain-specific knowledge, and commonsense, and provides 184,348 well-designed questions. Preliminary experiments with various representative pre-trained language models on KMIR reveal many interesting phenomena: 1) The memorization ability of PLMs depends more on the number of parameters than training schemes. 2) Current PLMs struggle to robustly remember the facts. 3) Model compression technology retains the amount of knowledge well, but hurts the identification and reasoning abilities. We hope KMIR can facilitate the design of PLMs as better knowledge sources.

Mike voted
Xinchen voted

#1350 - Gao 2024
Generative News Recommendation

Gao, S.; Fang, J.; Tu, Q.; Yao, Z.; Chen, Z.; Ren, P.; Ren, Z.

WWW 2024 - Proceedings of the ACM Web Conference 2024;():3444-3453

Association for Computing Machinery, Inc 2024

DOI: 10.1145/3589334.3645448 · Ref ID: 4060

Most existing news recommendation methods tackle news recommendation by conducting semantic matching between candidate news and a user representation produced from historically clicked news. However, they overlook the high-level connections among different news articles and also ignore the profound relationship between these news articles and users. Moreover, by design, these methods can only deliver news articles as-is. On the contrary, integrating several relevant news articles into a coherent narrative would assist users in gaining a quicker and more comprehensive understanding of events. In this paper, we propose a novel generative news recommendation paradigm that includes two steps: (1) leveraging the internal knowledge and reasoning capabilities of the Large Language Model (LLM) to perform high-level matching between candidate news and the user representation; (2) generating a coherent and logically structured narrative based on the associations between related news and user interests, thus engaging users in further reading of the news. Specifically, we propose GNR to implement the generative news recommendation paradigm. First, we compose dual-level representations of news and users by leveraging the LLM to generate theme-level representations and combining them with semantic-level representations. Next, in order to generate a coherent narrative, we explore the relations among news articles and filter the related news according to user preference. Finally, we propose a novel training method named UIFT to train the LLM to fuse multiple news articles into a coherent narrative. Extensive experiments show that GNR can improve recommendation accuracy and eventually generate more personalized and factually consistent narratives. © 2024 Owner/Author.

mohammed afaan voted
yuexi voted

#421 - Gao 2024
Knowledge Enhanced Vision and Language Model for Multi-Modal Fake News Detection

Gao, X. Y.; Wang, X.; Chen, Z. Y.; Zhou, W.; Hoi, S. C. H.

IEEE Trans. Multimedia 2024;26():8312-8322

2024

DOI: 10.1109/tmm.2023.3330296 · Ref ID: 3200

The rapid dissemination of fake news and rumors through the Internet and social media platforms poses significant challenges and raises concerns in the public sphere. Automatic detection of fake news plays a crucial role in mitigating the spread of misinformation. While recent approaches have focused on leveraging neural networks to improve textual and visual representations in multi-modal fake news analysis, they often overlook the potential of incorporating knowledge information to verify facts within news articles. In this paper, we present a vision and language model that incorporates knowledge to enhance multi-modal fake news detection. Our proposed model integrates information from large-scale open knowledge graphs to augment its ability to discern the veracity of news content. Unlike previous methods that utilize separate models to extract textual and visual features, we synthesize a unified model capable of extracting both types of features simultaneously. To represent news articles, we introduce a graph structure where nodes encompass entities, relationships extracted from the textual content, and objects depicted in associated images. By utilizing the knowledge graph, we establish meaningful relationships between nodes within the news articles. Experimental evaluations on a real-world multi-modal dataset from Twitter demonstrate significant performance improvement by incorporating knowledge information.

mohammed afaan voted
yuexi voted

#3650 - Gao 2023
Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction

Gao, Yanjun; Li, Ruizhe; Caskey, John; Dligach, Dmitriy; Miller, Timothy; Churpek, Matthew M.; Afshar, Majid

arXiv 2023;():

2023

Ref ID: 7820

Electronic Health Records (EHRs) and routine documentation practices play a vital role in patients' daily care, providing a holistic record of health, diagnoses, and treatment. However, complex and verbose EHR narratives overload healthcare providers, risking diagnostic inaccuracies. While Large Language Models (LLMs) have showcased their potential in diverse language tasks, their application in the healthcare arena needs to ensure the minimization of diagnostic errors and the prevention of patient harm. In this paper, we outline an innovative approach for augmenting the proficiency of LLMs in the realm of automated diagnosis generation, achieved through the incorporation of a medical knowledge graph (KG) and a novel graph model: Dr.Knows, inspired by the clinical diagnostic reasoning process. We derive the KG from the National Library of Medicine's Unified Medical Language System (UMLS), a robust repository of biomedical knowledge. Our method negates the need for pre-training and instead leverages the KG as an auxiliary instrument aiding in the interpretation and summarization of complex medical concepts. Using real-world hospital datasets, our experimental results demonstrate that the proposed approach of combining LLMs with KG has the potential to improve the accuracy of automated diagnosis generation. More importantly, our approach offers an explainable diagnostic pathway, edging us closer to the realization of AI-augmented diagnostic decision support systems.

Srividya voted
Xinchen voted

#1077 - Garbas 2024
Choose Your Transformer: Improved Transferability Estimation of Transformer Models on Classification Tasks

Garbas, L.; Ploner, M.; Akbik, A.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():12752-12768

Association for Computational Linguistics (ACL) 2024

Ref ID: 4303

There currently exists a multitude of pre-trained transformer language models (LMs) that are readily available. From a practical perspective, this raises the question of which pre-trained LM will perform best if fine-tuned for a specific downstream NLP task. However, exhaustively fine-tuning all available LMs to determine the best-fitting model is computationally infeasible. To address this problem, we present an approach that inexpensively estimates a ranking of the expected performance of a given set of candidate LMs for a given task. Following a layer-wise representation analysis, we extend existing approaches such as H-score and LogME by aggregating representations across all layers of the transformer model. We present an extensive analysis of 20 transformer LMs, 6 downstream NLP tasks, and various estimators (linear probing, kNN, H-score, and LogME). Our evaluation finds that averaging the layer representations significantly improves the Pearson correlation coefficient between the true model ranks and the estimate, increasing from 0.58 to 0.86 for LogME and from 0.65 to 0.88 for H-score. © 2024 Association for Computational Linguistics.
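The headline numbers in this abstract are Pearson correlations between the estimated and the true fine-tuned model rankings. A minimal sketch of how such an evaluation is computed follows; the per-model scores are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical per-model scores: a transferability estimate per candidate LM
# (e.g. a LogME-style score averaged over layers) and the "true" downstream
# accuracy obtained after actually fine-tuning each model.
estimates = np.array([0.62, 0.48, 0.71, 0.55, 0.66])
true_acc = np.array([0.81, 0.74, 0.88, 0.70, 0.84])

def pearson(x, y):
    """Pearson correlation coefficient between two score vectors."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

r = pearson(estimates, true_acc)  # close to 1.0 => the estimator ranks well
```

A high r means the cheap estimator orders the candidate models almost the same way exhaustive fine-tuning would.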

Srividya voted
Mike voted

#2515 - García-García 2021
gPROFIT: A Tool to Assist the Automatic Extraction of Business Knowledge From Legacy Information Systems

García-García, J. A.; Maldonado, C. A.; Meidan, A.; Morillo-Baro, E.; Escalona, M. J.

IEEE Access 2021;9():94934-94952

2021

DOI: 10.1109/ACCESS.2021.3093356 · Ref ID: 6547

Business digitization is a crucial strategy for business growth in the 21st century. Its benefits include improving business process automation, customer satisfaction, productivity, decision-making, turnover, and adaptation to market changes. However, digitization is not a trivial task. As a major paradigm and mindset shift, it involves a lot of effort within an organization and therefore requires commitment from employees and managers. This is especially critical in companies whose business processes are mostly reliant on legacy information systems (LIS), which are usually specialized and based on technological architectures that could be considered obsolete. The replacement of these systems by more recent, process-oriented technologies, the building up of employees' know-how and the continued use of outdated documentation are difficult, expensive tasks that hinder the initiation of continuous improvement processes in companies. This paper proposes techniques for finding and extracting process models from legacy databases. Specifically, it (i) lays the theoretical foundations of a model-driven framework for systematically extracting business process models (conforming to the standard BPMN notation) from LIS, considering the process time perspective, and (ii) proposes a technological tool called gPROFIT, which uses machine learning techniques to support that theoretical framework, facilitate its use in real environments and extract the business knowledge embedded in such legacy systems. The paper also presents proofs-of-concept showing how our proposal has been validated in several legacy systems.

mohammed afaan voted
yuexi voted

#547 - Garrido-Muñoz 2023
MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish

Garrido-Muñoz, I.; Martínez-Santiago, F.; Montejo-Ráez, A.

Lang. Resour. Eval. 2023;():31

2023

DOI: 10.1007/s10579-023-09670-3 · Ref ID: 3258

The study of bias in language models is a growing area of work; however, both research and resources are focused on English. In this paper, we make a first approach focusing on gender bias in some freely available Spanish language models trained using popular deep neural networks, like BERT or RoBERTa. Some of these models are known for achieving state-of-the-art results on downstream tasks. These promising results have promoted such models' integration in many real-world applications and production environments, which could be detrimental to people affected by those systems. This work proposes an evaluation framework to identify gender bias in masked language models, with explainability in mind to ease the interpretation of the evaluation results. We have evaluated 20 different models for Spanish, including some of the most popular pretrained ones in the research community. Our findings state that varying levels of gender bias are present across these models. This approach compares the adjectives proposed by the model for a set of templates. We classify the given adjectives into understandable categories and compute two new metrics from model predictions, one based on the internal state (probability) and the other on the external state (rank). Those metrics are used to reveal biased models according to the given categories and to quantify the degree of bias of the models under study.

Srividya voted
Xinchen voted

#3986 - Ge 2024
What Do the Circuits Mean? A Knowledge Edit View

Ge, Huaizhi; Rudzicz, Frank; Zhu, Zining

arXiv 2024;():

2024

Ref ID: 8421

In the field of language model interpretability, circuit discovery is gaining popularity. Despite this, the true meaning of these circuits remains largely unanswered. We introduce a novel method to learn their meanings as a holistic object through the lens of knowledge editing. We extract circuits in the GPT-2 base model for classification tasks related to syntax and model safety, and study their knowledge property via a model edit dataset containing hierarchical entities. We find that these circuits contain entity knowledge but resist new knowledge, demonstrating a "confirmation bias" behavior. Additionally, we examine the impact of circuit size, discovering that an ideal "theoretical circuit" where essential knowledge is concentrated likely incorporates more than 5% but less than 50% of the model's parameters. We also assess the overlap between circuits from different datasets, finding moderate similarities. We proceed with analyzing the modular components of the circuits, finding that up to 60% of the circuits consist of layer normalization modules rather than attention or MLP modules, adding evidence to the ongoing debates regarding knowledge localization. In summary, our findings offer novel insights into the meanings of the circuits, and introduce directions for further interpretability and safety research of language models.

Mike voted
Srividya voted

#435 - Ge 2024
Knowledge Graph Embedding: An Overview

Ge, X.; Wang, Y. C.; Wang, B.; Kuo, C. C. J.

APSIPA Trans. Signal Inf. Proc. 2024;13(1):51

2024

DOI: 10.1561/116.00000065 · Ref ID: 3216

Many mathematical models have been leveraged to design embeddings for representing Knowledge Graph (KG) entities and relations for link prediction and many downstream tasks. These mathematically inspired models are not only highly scalable for inference in large KGs, but also have many explainable advantages in modeling different relation patterns that can be validated through both formal proofs and empirical results. In this paper, we provide a comprehensive overview of the current state of research in KG completion. In particular, we focus on two main branches of KG embedding (KGE) design: 1) distance-based methods and 2) semantic matching-based methods. We discover the connections between recently proposed models and present an underlying trend that might help researchers invent novel and more effective models. Next, we delve into CompoundE and CompoundE3D, which draw inspiration from 2D and 3D affine operations, respectively. They encompass a broad spectrum of distance-based embedding techniques. We also discuss an emerging approach for KG completion that leverages pre-trained language models (PLMs) and textual descriptions of entities and relations, and offer insights into the integration of KGE methods with PLMs for KG completion.

Ishan voted
Srividya voted

#2390 - Ge 2024
Enhancing Pre-Trained Language Models with Knowledge Representation Using Line Graphs

Ge, Z.; Zhu, Y.; Pan, R.

2024 3rd International Conference on Artificial Intelligence and Computer Information Technology (AICIT) 2024;():1-9

2024

DOI: 10.1109/AICIT62434.2024.10730175 · Ref ID: 7022

To address the inherent limitation of pre-trained language models regarding factual knowledge, current efforts encompass a variety of methods aimed at bolstering their capabilities through the integration of knowledge graphs as external sources. This augmentation seeks to enhance their performance across knowledge-driven tasks. However, the challenges of effectively encapsulating entity knowledge and mitigating the storage overhead associated with external knowledge persist. In this paper, we present a novel approach for representing entity knowledge. Our method leverages the relational context surrounding entities, departing from the conventional practice of employing distinct vector representations for each entity. Specifically, we propose a transformation of entity-level subgraphs into line graphs, allowing us to explicitly capture and model relational patterns inherent in entity adjacencies. In contrast to the original graph-based representation, our line graph-based model exhibits a heightened capacity to capture intricate knowledge structures. Through empirical evaluation across three downstream tasks - namely, relation extraction, entity typing, and question answering over knowledge graphs - we substantiate the efficacy of our approach. The experimental results demonstrate the superior performance of our model over prevailing state-of-the-art methodologies across the majority of tasks.
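The core transformation this abstract describes, turning an entity-level subgraph into a line graph, can be sketched in a few lines. The triples below are a hypothetical toy; the paper's actual model learns representations on top of this structure. Each triple becomes a node of the line graph, and two nodes are adjacent when their triples share an entity:

```python
def line_graph(triples):
    """Build the line graph of a KG subgraph given as (head, relation, tail)
    triples: nodes are the original edges; links join edges sharing an entity."""
    nodes = list(triples)
    links = set()
    for i, (h1, _, t1) in enumerate(nodes):
        for j, (h2, _, t2) in enumerate(nodes):
            if i < j and {h1, t1} & {h2, t2}:  # shared entity => adjacency
                links.add((i, j))
    return nodes, sorted(links)

# Hypothetical entity-level subgraph around "Einstein".
subgraph = [
    ("Einstein", "born_in", "Ulm"),
    ("Einstein", "field", "physics"),
    ("Ulm", "located_in", "Germany"),
]
nodes, links = line_graph(subgraph)
```

Here the first two triples are linked through "Einstein" and the first and third through "Ulm", so the relational pattern around each entity becomes explicit adjacency in the new graph.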

Mike voted
Srividya voted

#3168 - Ge 2024
WorldGPT: Empowering LLM as Multimodal World Model

Ge, Zhiqi; Huang, Hongzhe; Zhou, Mingze; Li, Juncheng; Wang, Guoming; Tang, Siliang; Zhuang, Yueting

Proceedings of the 32nd ACM International Conference on Multimedia 2024;():7346–7355

Melbourne VIC, Australia Association for Computing Machinery 2024

DOI: 10.1145/3664647.3681488 · Ref ID: 7203

Mike voted
Srividya voted

#3321 - Gema 2024
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

Gema, Aryo Pradipta; Jin, Chen; Abdulaal, Ahmed; Diethe, Tom; Teare, Philip; Alex, Beatrice; Minervini, Pasquale; Saseendran, Amrutha

arXiv 2024;():

2024

Ref ID: 8749

Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations and that contrasting the outputs of the base LLM and the masked LLM can reduce hallucinations. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%).
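A single decoding step of the contrastive idea can be sketched as follows. This is a loose illustration with invented logits and an assumed entropy-to-weight mapping, not the authors' exact formulation: the base model's logits are pushed away from those of the retrieval-head-masked model, with the contrast strength guided by the base model's conditional entropy.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def decore_step(base_logits, masked_logits):
    """One illustrative DeCoRe-style decoding step: amplify what the base
    model supports relative to the masked model, scaling the contrast by the
    base model's conditional entropy (hypothetical weighting choice)."""
    p = softmax(base_logits)
    entropy = float(-(p * np.log(p + 1e-12)).sum())
    alpha = entropy  # assumed: more contrast when the model is less certain
    contrasted = base_logits + alpha * (base_logits - masked_logits)
    return int(np.argmax(contrasted))

base = np.array([2.0, 1.0, 0.5])    # base model favours token 0
masked = np.array([0.5, 1.8, 0.5])  # masked model drifts toward token 1
tok = decore_step(base, masked)
```

The contrast suppresses the token that only the masked (hallucination-prone) model prefers, keeping the context-supported choice.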

yuexi voted
Srividya voted

#234 - Gerritse 2022
Entity-aware Transformers for Entity Search

Gerritse, E. J.; Hasibi, F.; de Vries, A. P.; Acm

45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2022;():1455-1465

Madrid, SPAIN Assoc Computing Machinery 2022

DOI: 10.1145/3477495.3531971 · Ref ID: 3290

Pre-trained language models such as BERT have been a key ingredient to achieve state-of-the-art results on a variety of tasks in natural language processing and, more recently, also in information retrieval. Recent research even claims that BERT is able to capture factual knowledge about entity relations and properties, the information that is commonly obtained from knowledge graphs. This paper investigates the following question: Do BERT-based entity retrieval models benefit from additional entity information stored in knowledge graphs? To address this research question, we map entity embeddings into the same input space as a pre-trained BERT model and inject these entity embeddings into the BERT model. This entity-enriched language model is then employed on the entity retrieval task. We show that the entity-enriched BERT model improves effectiveness on entity-oriented queries over a regular BERT model, establishing a new state-of-the-art result for the entity retrieval task, with substantial improvements for complex natural language queries and queries requesting a list of entities with a certain property. Additionally, we show that the entity information provided by our entity-enriched model particularly helps queries related to less popular entities. Last, we observe empirically that the entity-enriched BERT models enable fine-tuning on limited training data, which otherwise would not be feasible due to the known instabilities of BERT in few-sample fine-tuning, thereby contributing to data-efficient training of BERT for entity search.
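The injection step this abstract describes, mapping entity embeddings into the same input space as BERT, amounts to a learned linear projection. A minimal sketch with assumed dimensions (300-d entity vectors and the 768-d BERT-base input space; both figures and the random data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

ENT_DIM, BERT_DIM = 300, 768  # assumed entity and BERT-base input dimensions
W = rng.normal(scale=0.02, size=(ENT_DIM, BERT_DIM))  # projection (trainable)

entity_emb = rng.normal(size=ENT_DIM)        # embedding of one linked entity
token_embs = rng.normal(size=(5, BERT_DIM))  # wordpiece embeddings of a query

projected = entity_emb @ W  # entity vector now lives in BERT's input space
# Inject the projected entity alongside the wordpiece embeddings before
# feeding the sequence to the transformer layers.
enriched = np.vstack([token_embs, projected[None, :]])
```

During fine-tuning the projection W would be trained jointly with the retrieval objective, so the entity signal aligns with what the transformer already expects at its input.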

yuexi voted
Mike voted

#1322 - Ghanem 2024
Fine-Tuning vs. Prompting: Evaluating the Knowledge Graph Construction with LLMs

Ghanem, H.; Cruz, C.

CEUR Workshop Proceedings 2024;3747():18

CEUR-WS 2024

Ref ID: 4334

This paper explores Text-to-Knowledge Graph (T2KG) construction, assessing Zero-Shot Prompting (ZSP), Few-Shot Prompting (FSP), and Fine-Tuning (FT) methods with Large Language Models (LLMs). Through comprehensive experimentation with Llama2, Mistral, and Starling, we highlight the strengths of FT, emphasize the role of dataset size, and introduce nuanced evaluation metrics. Promising perspectives include synonym-aware metric refinement and data augmentation with LLMs. The study contributes valuable insights to KG construction methodologies, setting the stage for further advancements. © 2024 Copyright for this paper by its authors.

Srividya voted
Xinchen voted

#1592 - Ghassabi 2023
Leveraging Knowledge Graphs for Matching Heterogeneous Entities and Explanation

Ghassabi, S.; Behkamal, B.; Milani, M.

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():2910-2919

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BigData59044.2023.10386157 · Ref ID: 4909

Entity matching (EM), also known as record linkage, is crucial in data integration, cleaning, and knowledge base construction. Modern matching techniques leverage deep learning and pre-trained language models (PLMs) to effectively identify matching records, showcasing significant advancements over traditional methods. However, certain critical matching aspects have received limited attention in these techniques. They heavily rely on PLMs' encodings and face challenges in integrating external sources of knowledge to enhance matching accuracy. Additionally, these techniques often lack transparency, impeding users' understanding of the underlying rationale for matching decisions. Furthermore, they exhibit limitations and decreased performance in handling heterogeneous records from datasets with diverse schemas. This paper presents EXKG, a novel technique that addresses these challenges and effectively matches heterogeneous records with varying attributes. EXKG combines the power of knowledge graphs (KGs) and PLMs to perform record linkage while offering explanatory insights into the matching results. Through experimental studies, we demonstrate that EXKG achieves performance competitive with state-of-the-art matching techniques. As a by-product, our solution generates explanations that give end users a comprehensive understanding of the matching process. We evaluate the quality of these explanations through a user study and show that they empower end users to make informed decisions. © 2023 IEEE.

Kwesi voted
mohammed afaan voted

#41 - Gilbert 2024
Augmented non-hallucinating large language models as medical information curators

Gilbert, S.; Kather, J. N.; Hogan, A.

npj Digit. Med. 2024;7(1):5

2024

DOI: 10.1038/s41746-024-01081-0 · Ref ID: 3223

Reliably processing and interlinking medical information has been recognized as a critical foundation for the digital transformation of medical workflows, and despite the development of medical ontologies, their optimization has been a major bottleneck for digital medicine. The advent of large language models has brought great excitement, and perhaps a solution to medicine's 'communication problem' is in sight, but how can the known weaknesses of these models, such as hallucination and non-determinism, be tempered? Retrieval Augmented Generation, particularly through knowledge graphs, is an automated approach that can deliver structured reasoning and a model of truth alongside LLMs, relevant to information structuring and therefore also to decision support.
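The KG-grounded retrieval-augmentation pattern this comment advocates can be sketched at its simplest: retrieve structured facts for the entities in a question and place them in the prompt, so the model's answer is constrained by a model of truth. The tiny knowledge graph and prompt template below are hypothetical, for illustration only:

```python
# Toy (subject, relation) -> object store standing in for a medical KG.
KG = {
    ("metformin", "treats"): "type 2 diabetes",
    ("metformin", "contraindicated_in"): "severe renal impairment",
}

def build_prompt(question, entity):
    """Retrieve the KG facts about `entity` and prepend them as grounding
    context, instructing the model to answer only from that context."""
    facts = [f"{entity} {rel} {obj}." for (subj, rel), obj in KG.items()
             if subj == entity]
    context = "\n".join(facts)
    return f"Context:\n{context}\n\nAnswer using only the context.\nQ: {question}"

prompt = build_prompt("What does metformin treat?", "metformin")
```

Because the retrieved triples come from a curated source, the generated answer can be audited against them, which is the structured-reasoning property the authors emphasize.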

mohammed afaan voted
yuexi voted

#2464 - Gómez-Pérez 2013
A Formalism and Method for Representing and Reasoning with Process Models Authored by Subject Matter Experts

Gómez-Pérez, J. M.; Erdmann, M.; Greaves, M.; Corcho, O.

IEEE Transactions on Knowledge and Data Engineering 2013;25(9):1933-1945

2013

DOI: 10.1109/TKDE.2012.127 · Ref ID: 6019

Enabling Subject Matter Experts (SMEs) to formulate knowledge without the intervention of Knowledge Engineers (KEs) requires providing SMEs with methods and tools that abstract the underlying knowledge representation and allow them to focus on modeling activities. Bridging the gap between SME-authored models and their representation is challenging, especially in the case of complex knowledge types like processes, where aspects like frame management, data, and control flow need to be addressed. In this paper, we describe how SME-authored process models can be provided with an operational semantics and grounded in a knowledge representation language like F-logic to support process-related reasoning. The main results of this work include a formalism for process representation and a mechanism for automatically translating process diagrams into executable code following such formalism. Of all the process models authored by SMEs during evaluation, 82 percent were well formed, all of which executed correctly. Additionally, the two optimizations applied to the code generation mechanism produced performance improvements at reasoning time of 25 and 30 percent with respect to the base case, respectively.

Xinchen voted
mohammed afaan voted

#828 - Gong 2020
Towards Knowledge Enhanced Language Model for Machine Reading Comprehension

Gong, P. Z.; Liu, J.; Yang, Y. H.; He, H. H.

IEEE Access 2020;8():224837-224851

2020

DOI: 10.1109/access.2020.3044308 · Ref ID: 3556

Machine reading comprehension is a crucial and challenging task in natural language processing (NLP). Recently, knowledge graph (KG) embedding has gained massive attention as it can effectively provide side information for downstream tasks. However, most previous knowledge-based models do not take into account the structural characteristics of the triples in KGs, and only convert them into vector representations for direct accumulation, leading to deficiencies in knowledge extraction and knowledge fusion. In order to alleviate this problem, we propose a novel deep model, KCF-NET, which incorporates knowledge graph representations with context as the basis for predicting answers by leveraging a capsule network to encode the intrinsic spatial relationships in triples of the KG. In KCF-NET, we fine-tune BERT, a high-performance contextual language representation model, to capture complex linguistic phenomena. Besides, a novel fusion structure based on a multi-head attention mechanism is designed to balance the weight of knowledge and context. To evaluate the knowledge expression and reading comprehension ability of our model, we conducted extensive experiments on multiple public datasets such as WN11, FB13, SemEval-2010 Task 8 and SQuAD. Experimental results show that KCF-NET achieves state-of-the-art results in both link prediction and MRC tasks with a negligible parameter increase compared to BERT-Base, and gets competitive results in the triple classification task with significantly reduced model size.

Ishan voted
Srividya voted

#3019 - González-de-Aledo 2017
Towards a Verification Flow Across Abstraction Levels Verifying Implementations Against Their Formal Specification

González-de-Aledo, P.; Przigoda, N.; Wille, R.; Drechsler, R.; Sánchez, P.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2017;36(3):475-488

2017

DOI: 10.1109/TCAD.2016.2611494 · Ref ID: 6578

The use of formal models to describe early versions of the structure and the behavior of a system has become common practice in industry. UML and OCL are the de-facto specification languages for these tasks. They allow for capturing system properties and module behavior in an abstract but still formal fashion. At the same time, this enables designers to detect errors or inconsistencies in the initial phases of the design flow, even if the implementation has not yet started. Corresponding tools for the verification of formal models have become established in the recent past. However, verification results are usually not reused in later design steps anymore. In fact, similar verification tasks are applied again, e.g., after the implementation has been completed. This is a waste of computational and human effort. In this paper, we address this problem by proposing a method which checks a given implementation of a system against its corresponding formal model. This allows for transferring verification results already obtained from the formal model to the implementation and, eventually, motivates a new design flow which addresses verification across abstraction levels. This paper describes the applied techniques as well as their orchestration. Afterwards, the applicability of the proposed methodology is demonstrated by means of examples as well as a case study from an industrial context.

mohammed afaan voted
yuexi voted

#218 - Gonzalez-Garcia 2024
Enhancing knowledge graphs with microdata and LLMs: the case of Schema.org and Wikidata in touristic information

Gonzalez-Garcia, L.; González-Carreño, G.; Machota, A. M. R.; Fernández-Vega, J. P.

Electron. Libr. 2024;42(3):443-454

2024

DOI: 10.1108/el-06-2023-0160 · Ref ID: 3701

Purpose: Knowledge graphs (KGs) are structured knowledge bases that represent real-world entities and are used in a variety of applications. Many of them are created and curated through a combination of automated and manual processes. Microdata embedded in Web pages to facilitate indexing and search engine optimization are a potential source for augmenting KGs, under assumptions of complementarity and quality that have not been thoroughly explored to date. In that direction, this paper reports results of a study that evaluates the potential of using microdata extracted from the Web to augment the large, open and manually curated Wikidata KG in the domain of touristic information. As large corpora of Web text are currently being leveraged via large language models (LLMs), these are used to compare the effectiveness of the microdata enhancement method.

Design/methodology/approach: The Schema.org taxonomy was used as the source to determine the annotation types to be collected. The authors focused on tourism-related pages as a case study, selecting the relevant Schema.org concepts as the point of departure. The large CommonCrawl resource was used to select those annotations from a large recent sample of the World Wide Web. The extracted annotations were processed and matched with Wikidata to estimate the degree to which microdata produced for SEO might become a valuable resource to complement KGs, or vice versa. The Web pages themselves can also serve as context for producing additional metadata elements, by using them in pipelines built on existing LLMs. That way, both the annotations and the content itself can be used as sources.

Findings: The samples extracted revealed a concentration of metadata annotations in only a few of the relevant Schema.org attributes, and also revealed the possible influence of authoring tools in a significant fraction of the microdata produced. The analysis of the overlap of attributes in the sample with those of Wikidata showed the potential of the technique, limited by the imbalance in the presence of attributes. The combination of those with the use of LLMs to produce additional annotations demonstrates the feasibility of the approach in populating existing Wikidata locations. However, in both cases, the effectiveness appears to be lower for entries with less content in the KG, which are arguably the most relevant when considering the scenario of an automated population approach.

Originality/value: The research reports novel empirical findings on the way touristic annotations with an SEO orientation are being produced in the wild and provides an assessment of their potential to complement KGs, or to reuse information from those graphs. It also provides insights on the potential of using LLMs for the task.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#231 - González 2020
Entity Linking as a Population Mechanism for Skill Ontologies: Evaluating the Use of ESCO and Wikidata

González, L.; García-Barriocanal, E.; Sicilia, M. A.

14th International Conference on Metadata and Semantic Research (MTSR) 2020;1355():116-122

Madrid, SPAIN Springer International Publishing Ag 2020

DOI: 10.1007/978-3-030-71903-6_12 · Ref ID: 3348

Ontologies or databases describing occupations in terms of competences or skills are an important resource for a number of applications. Exploiting large knowledge graphs thus becomes a promising direction for updating those ontologies with entities from the latter, which may be updated faster, especially in the case of crowd-sourced resources. Here we report a first assessment of the potential of that strategy by matching knowledge elements in ESCO to Wikidata using NER and document similarity models available in the spaCy NLP library. Results show that the approach may be effective, but the use of pre-trained language models and the short texts included with entities (labels and descriptions) does not yield sufficient quality for a fully automated process.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#779 - Gosselin 2023
SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT

Gosselin, F.; Zouaq, A.

22nd International Semantic Web Conference (ISWC) 2023;14265():561-578

Athens, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-47240-4_30 · Ref ID: 3381

Ontology embedding methods have been popular in recent years, especially as representation learning algorithms for solving ontology-related tasks. Despite the impact of large language models on knowledge graph-related tasks, there has been less focus on adapting these models to construct ontology embeddings that are both semantically relevant and faithful to the ontological structure. In this paper, we present a novel ontology embedding method that encodes ontology classes into a pre-trained SBERT through random walks and then fine-tunes the embeddings using a distance-based regression loss. We benchmark our algorithm on four different datasets across two tasks and show the impact of transfer learning and our distance-based loss on the quality of the embeddings. Our results show that SORBET outperforms state-of-the-art ontology embedding techniques on the performed tasks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#25 - Gottlob 2024
Artificial Intelligence and Artificial Ignorance

Gottlob, G.

32nd EACSL Annual Conference on Computer Science Logic (CSL) 2024;288():

Naples, ITALY Schloss Dagstuhl, Leibniz Center Informatics 2024

DOI: 10.4230/LIPIcs.CSL.2024.3 · Ref ID: 3638

This invited talk first delves into the division between the two primary branches of AI research: symbolic AI, which predominantly focuses on knowledge representation and logical reasoning, and sub-symbolic AI, primarily centered on machine learning employing neural networks. We explore both the notable accomplishments and the challenges encountered in each of these approaches. We provide instances where traditional deep learning encounters limitations, and we elucidate significant obstacles in achieving automated symbolic reasoning. We then discuss the recent groundbreaking advancements in generative AI, driven by language models such as ChatGPT. We showcase instances where these models excel and, conversely, where they exhibit shortcomings and produce erroneous information. We identify and illustrate five key reasons for potential failures in language models, which include: (i) information loss due to data compression, (ii) training bias, (iii) the incorporation of incorrect external data, (iv) the misordering of results, and (v) the failure to detect and resolve logical inconsistencies contained in a sequence of LLM-generated prompt-answers. Lastly, we touch upon the Chat2Data project, which endeavors to leverage language models for the automated verification and enhancement of relational databases, all while mitigating the pitfalls (i)-(v) mentioned earlier.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1603 - Gou 2023
A lightweight biomedical named entity recognition with pre-trained model

Gou, Y.; Jie, C.

2023 IEEE 3rd International Conference on Data Science and Computer Application, ICDSCA 2023 2023;():117-121

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICDSCA59871.2023.10392374 · Ref ID: 4976

Biomedical Named Entity Recognition (BioNER) is a specialized subfield of Named Entity Recognition (NER) that focuses on identifying and classifying named entities in biomedical and clinical texts. The goal of BioNER is to extract essential information, such as genes, proteins, diseases, and drugs, from scientific literature, electronic health records (EHRs), biomedical databases, and other biomedical text sources. The recognition and classification of these entities are crucial for various biomedical and healthcare-related tasks, including information retrieval, data integration, knowledge extraction, and drug discovery. Traditional BioNER methods typically involve rule-based approaches or machine learning algorithms; these were widely used before the advent of deep learning and transformer-based models. Bidirectional Encoder Representations from Transformers (BERT) is a groundbreaking transformer-based language model. It has revolutionized various natural language processing (NLP) tasks by capturing contextual information and obtains state-of-the-art results on multiple benchmarks. This study proposes LWNER, a lightweight BioNER model optimized from traditional BERT, which captures contextual information and transfers knowledge from pre-training on large-scale text corpora without relying heavily on feature engineering and handcrafted rules. Fine-tuning BERT on biomedical-specific data helps adapt the model to the nuances and terminology of the biomedical domain. We evaluate LWNER on the BioCreative datasets BC2GM, BC4CHEMD, and BC5CDR; in particular, chemical entity recognition on BC5CDR achieves an F1-score of 91.3%. We also construct an online web tool based on LWNER to identify entities in arbitrary text from the scientific literature for building knowledge graphs. © 2023 IEEE.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#532 - Gouidis 2024
LLM-aided Knowledge Graph construction for Zero-Shot Visual Object State Classification

Gouidis, F.; Papantoniou, K.; Papoutsakis, K.; Patkos, T.; Argyros, A.; Plexousakis, D.; Ieee

14th International Conference on Pattern Recognition Systems (ICPRS) 2024;():

London, ENGLAND Ieee 2024

DOI: 10.1109/icprs62101.2024.10677802 · Ref ID: 3132

The problem of classifying the states of objects using visual information holds great importance in both applied and theoretical contexts. This work focuses on the special case of Zero-shot Object-Agnostic State Classification (ZS-OaSC). To tackle this problem, we introduce an innovative strategy that capitalizes on the capabilities of Graph Neural Networks to learn to project semantic embeddings into visual space and on the potential of Large Language Models (LLMs) to provide rich content for constructing Knowledge Graphs (KGs). Through a comprehensive ablation study, we explore the synergies between LLMs and KGs, uncovering critical insights about their integration in the context of the ZS-OaSC problem. Our proposed methodology is rigorously evaluated against current state-of-the-art (SoA) methods, demonstrating superior performance on various image datasets.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1436 - Graham 2023
Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study

Graham, S.; Yates, D.; El-Roby, A.

Open. Res. Eur. 2023;3():

2023

DOI: 10.12688/openreseurope.16003.1 · Ref ID: 5278

Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3's understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it, where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources by hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection, and the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquities trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph produces results comparable to our hand-made version, but at an enormous saving of time and with a possible expansion of the amount of material we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means. Copyright: © 2023 Graham S et al.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#2650 - Graupner 2009
Making processes from best practice frameworks actionable

Graupner, S.; Motahari-Nezhad, H. R.; Singhal, S.; Basu, S.

2009 13th Enterprise Distributed Object Computing Conference Workshops 2009;():25-34

2009

DOI: 10.1109/EDOCW.2009.5332021 · Ref ID: 6860

Best-practice frameworks provide guidance for organizing work in business. They enable reuse of experience within a domain. However, best-practice frameworks are general and usually cover broad domains. Their guidance is thus often offered at an abstract level rather than as details of actionable tasks and processes to accomplish work. This paper presents an approach to bridge the gap between the abstractions available in best-practice frameworks and the actions that have to be performed by people or systems in a repeatable manner. We identify knowledge from best-practice frameworks, categorize it, and represent it in the form of reusable, interpretable templates. Template interpretation guides the refinement process from the general concepts of best-practice frameworks into actionable concepts such as specific tasks to be performed by assigned roles. A prototype implemented to validate the approach is also described.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#604 - Gromann 2020
Neural language models for the multilingual, transcultural, and multimodal Semantic Web

Gromann, D.

Semant. Web 2020;11(1):29-39

2020

DOI: 10.3233/sw-190373 · Ref ID: 3219

A vision of a truly multilingual Semantic Web has found strong support with the Linguistic Linked Open Data community. Standards, such as OntoLex-Lemon, highlight the importance of explicit linguistic modeling in relation to ontologies and knowledge graphs. Nevertheless, there is room for improvement in terms of automation, usability, and interoperability. Neural Language Models have achieved several breakthroughs and successes considerably beyond Natural Language Processing (NLP) tasks and recently also in terms of multimodal representations. Several paths naturally open up to port these successes to the Semantic Web, from automatically translating linguistic information associated with structured knowledge resources to multimodal question-answering with machine translation. Language is also an important vehicle for culture, an aspect that deserves considerably more attention. Building on existing approaches, this article envisions joint forces between Neural Language Models and Semantic Web technologies for multilingual, transcultural, and multimodal information access and presents open challenges and opportunities in this direction.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1980 - Gromann 2019
Towards the detection and formal representation of semantic shifts in inflectional morphology

Gromann, D.; Declerck, T.

OpenAccess Series in Informatics 2019;70():

Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing 2019

DOI: 10.4230/OASIcs.LDK.2019.21 · Ref ID: 5727

Semantic shifts caused by derivational morphemes are a common subject of investigation in language modeling, while inflectional morphemes are frequently portrayed as semantically more stable. This study is motivated by the previously established observation that inflectional morphemes can be just as variable as derivational ones. For instance, the English plural “-s” can turn the fabric silk into the garments of a jockey, silks. While humans know that silk in this sense has no plural, it takes more for machines to arrive at this conclusion. Frequently utilized computational language resources, such as WordNet, or models for representing computational lexicons, like OntoLex-Lemon, have no descriptive mechanism to represent such inflectional semantic shifts. To investigate this phenomenon, we extract word pairs of different grammatical number from WordNet that feature additional senses in the plural and evaluate their distribution in vector space, i.e., pre-trained word2vec and fastText embeddings. We then propose an extension of OntoLex-Lemon to accommodate this phenomenon, which we call inflectional morpho-semantic variation, to provide a formal representation accessible to algorithms, neural networks, and agents. While the exact scope of the problem is yet to be determined, this first dataset shows that it is not negligible. © Dagmar Gromann and Thierry Declerck.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1842 - Gu 2024
RRE: A Relevance Relation Extraction Framework for Cross-domain Recommender System at Alipay

Gu, J.; Xu, X.; Tian, Y.; Hu, Y.; Huang, J.; Zhong, W.; Zhou, F.; Gao, L.

Proceedings - IEEE International Conference on Multimedia and Expo 2024;():

IEEE Computer Society 2024

DOI: 10.1109/ICME57554.2024.10687762 · Ref ID: 4174

Prevailing embedding-based cross-domain recommendation (CDR) techniques produce embeddings individually or transfer the overall feature distribution from one domain to another. However, in real-world applications, they may be ineffective due to the semantic gap across domains, which arises from divergent purposes and descriptive styles. In this work, we aim to address this challenge between the Mini Program and content channel in Alipay, the largest mobile payment platform in China. To bridge utility-oriented Mini Programs and advertisement-oriented contents, we utilize side information of entities to make the entity relevance scores trustworthy. We then introduce a knowledge graph-based model to reduce the impact of embedding instability arising from contrastive learning and of the biases from the pretrained language models. Extensive experiments conducted on a large-scale Alipay offline dataset as well as in an online environment demonstrate the effectiveness of our proposed framework. © 2024 IEEE.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3340 - Gu 2023
Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

Gu, Yu; Zhang, Sheng; Usuyama, Naoto; Woldesenbet, Yonas; Wong, Cliff; Sanapathi, Praneeth; Wei, Mu; Valluri, Naveen; Strandberg, Erika; Naumann, Tristan; Poon, Hoifung

arXiv 2023;():

2023

Ref ID: 7777

Large language models (LLMs), such as GPT-4, have demonstrated remarkable capabilities across a wide range of tasks, including health applications. In this paper, we study how LLMs can be used to scale biomedical knowledge curation. We find that while LLMs already possess decent competency in structuring biomedical text, by distillation into a task-specific student model through self-supervised learning, substantial gains can be attained over out-of-box LLMs, with additional advantages such as cost, efficiency, and white-box model access. We conduct a case study on adverse drug event (ADE) extraction, which is an important area for improving care. On standard ADE extraction evaluation, a GPT-3.5 distilled PubMedBERT model attained comparable accuracy as supervised state-of-the-art models without using any labeled data. Despite being over 1,000 times smaller, the distilled model outperformed its teacher GPT-3.5 by over 6 absolute points in F1 and GPT-4 by over 5 absolute points. Ablation studies on distillation model choice (e.g., PubMedBERT vs BioGPT) and ADE extraction architecture shed light on best practice for biomedical knowledge extraction. Similar gains were attained by distillation for other standard biomedical knowledge extraction tasks such as gene-disease associations and protected health information, further illustrating the promise of this approach.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3867 - Gu 2023
Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining

Gu, Zhouhong; Jiang, Sihang; Huang, Wenhao; Liang, Jiaqing; Feng, Hongwei; Xiao, Yanghua

arXiv 2023;():

2023

Ref ID: 7664

A model's ability to understand synonymous expressions is crucial for many kinds of downstream tasks: it helps the model better capture the similarity between contexts and makes it more robust to synonym substitution attacks. However, many pretrained language models (PLMs) lack synonym knowledge due to the limitations of small-scale synsets and PLMs' pretraining objectives. In this paper, we propose a framework called Sem4SAP to mine synsets from an Open Knowledge Graph (Open-KG) and use the mined synsets for synonym-aware pretraining of language models. We propose to coarsely filter the content in the Open-KG and use frequency information to better assist the clustering process under low-resource, unsupervised conditions. We expand the mined synsets by migrating core semantics between synonymous expressions. We also propose two novel and effective synonym-aware pre-training methods for injecting synonym knowledge into PLMs. Extensive experiments demonstrate that Sem4SAP dramatically outperforms the original PLMs and other baselines on ten different tasks.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3393 - Gunaratna 2021
Entity Context Graph: Learning Entity Representations from Semi-Structured Textual Sources on the Web

Gunaratna, Kalpa; Wang, Yu; Jin, Hongxia

arXiv 2021;():

2021

Ref ID: 7446

Knowledge is captured in the form of entities and their relationships and stored in knowledge graphs. Knowledge graphs enhance the capabilities of applications in many different areas, including Web search, recommendation, and natural language understanding, mainly because entities enable machines to understand things that go beyond simple tokens. Many modern algorithms use entity embeddings learned from these structured representations. However, building a knowledge graph takes time and effort and is hence costly and nontrivial. On the other hand, many Web sources describe entities in some structured format, so finding ways to turn them into useful entity knowledge is advantageous. We propose an approach that processes entity-centric textual knowledge sources to learn entity embeddings, thereby avoiding the need for a traditional knowledge graph. We first extract triples into a new representation format that does not rely on traditional, complex triple extraction methods defined by pre-determined relationship labels. Then we learn entity embeddings from this new type of triples. We show that the embeddings learned with our approach are: (i) of high quality, comparable to known knowledge graph-based embeddings, and usable to improve them further; (ii) better than contextual language model-based entity embeddings; and (iii) easy to compute and versatile in domain-specific applications where a knowledge graph is not readily available.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#562 - Guo 2024
Memory-Enhanced Knowledge Reasoning with Reinforcement Learning

Guo, J. H.; Zhang, X. L.; Liang, K.; Zhang, G. Q.

Appl. Sci.-Basel 2024;14(7):21

2024

DOI: 10.3390/app14073133 · Ref ID: 3478

In recent years, the emergence of large-scale language models, such as ChatGPT, has presented significant challenges to research on knowledge graphs and knowledge-based reasoning, shifting the direction of research on knowledge reasoning. Two critical issues in knowledge reasoning research are the algorithm of the model itself and the selection of paths. Most studies utilize LSTM as the path encoder and memory module. However, when processing long sequences, LSTM models may suffer from long-term dependency problems: the model's memory units may decay gradually as time steps increase, forgetting earlier input information and degrading performance on long sequence data. Additionally, as data volume and network depth increase, there is a risk of vanishing gradients. This study improved and optimized the LSTM model to effectively address the problems of exploding and vanishing gradients. An attention layer was employed to alleviate the long-term dependency issue, and ConvR embeddings were used to guide path selection and action pruning in the reinforcement learning inference model. The overall model achieved excellent reasoning results.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1216 - Guo 2022
Dynamic Knowledge Integration for Natural Language Inference

Guo, M.; Chen, Y.; Xu, J.; Zhang, Y.

Proceedings - 2022 4th International Conference on Natural Language Processing, ICNLP 2022 2022;():360-364

Institute of Electrical and Electronics Engineers Inc. 2022

DOI: 10.1109/ICNLP55136.2022.00066 · Ref ID: 5483

Natural language inference (NLI) aims to determine the entailment relationship between a premise and a hypothesis. It is a fundamental but difficult problem, since there may exist a serious semantic and logical gap between the premise and the hypothesis. Despite using strong pre-trained language models (PLMs), previous work performs poorly on complicated reasoning in knowledge-sensitive cases because it ignores the integration of external knowledge. We propose a dynamic knowledge integration strategy for NLI, where knowledge from multiple knowledge graphs (KGs) can be dynamically integrated. For each KG, it transforms input tokens into a graph according to the connectivity of the related entities. All the graphs are encoded by a group of parallel graph neural networks (GNNs), and after each layer the intermediate results are integrated dynamically, conditioned on the input text. This strategy also facilitates the incorporation of a PLM, simply by treating the input tokens as a fully connected graph and adapting the PLM outputs as the node embeddings. Experiments on SNLI, MNLI, and SciTail show that dynamic integration of knowledge from WordNet and ConceptNet achieves significant improvements over the strongest baseline built upon RoBERTa. © 2022 IEEE.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#556 - Guo 2022
A medical question answering system using large language models and knowledge graphs

Guo, Q.; Cao, S.; Yi, Z.

Int. J. Intell. Syst. 2022;37(11):8548-8564

2022

DOI: 10.1002/int.22955 · Ref ID: 3029

Question answering systems have become prominent in all areas, but in the medical domain they remain challenging because of the abundant domain knowledge required. Retrieval-based approaches have become promising as large pretrained language models have come forth. This study focuses on building a retrieval-based medical question answering system, tackling the challenge with large language models and knowledge extensions via graphs. We first retrieve an extensive but coarse set of answers efficiently via Elasticsearch. Then, we utilize semantic matching with pretrained language models to achieve fine-grained ranking, enhanced with named entity recognition and knowledge graphs to exploit the relations of the entities in the question and answer. A new architecture based on siamese structures for answer selection is proposed. To evaluate the approach, we train and test the model on two Chinese data sets, NLPCC2017 and cMedQA. We also conduct experiments on two English data sets, TREC-QA and WikiQA. Our model achieves consistent improvements over strong baselines on all data sets. Qualification studies with cMedQA and our in-house data set show that our system attains highly competitive performance. The proposed medical question answering system outperforms baseline models and systems in quantitative and qualitative evaluations.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1249 - Guo 2024
Enhancing Commonsense Reasoning through Entity Type Knowledge Graph Completion

Guo, Y.; Mao, C.; Yue, D.; Leng, T.

2024 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024 2024;():298-302

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/AIEA62095.2024.10692845 · Ref ID: 4122

Entity type knowledge graphs play a crucial role in enhancing the commonsense reasoning capabilities of pre-trained models, but their incomplete information often hinders the performance of these pre-trained language models. We propose a novel method that enhances commonsense reasoning by supplementing missing entity types in knowledge graphs through the aggregation of single-hop and multi-hop neighbor information. Our approach consists of three main components: aggregating neighbor information, inferring missing entity types using both local and global reasoning, and predicting the final entity types based on a combined scoring mechanism. We demonstrate the effectiveness of our method on two widely recognized datasets, CommonsenseQA and OpenBookQA. Notably, on the OpenBookQA dataset, enhancing the BART pre-trained model with the completed entity type knowledge graph improved its accuracy from 82.8% to 87.4% compared to using the original, incomplete knowledge graph. Experimental results indicate that enriching entity type information significantly enhances the ability of pretrained models to leverage implicit commonsense knowledge, particularly in tasks requiring a deep understanding of entity relationships. © 2024 IEEE.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#419 - Gupta 2021
Knowledge Based Deep Inception Model for Web Page Classification

Gupta, A.; Bhatia, R.

J. Web Eng. 2021;20(7):2131-2167

2021

DOI: 10.13052/jwe1540-9589.2075 · Ref ID: 3273

Web page classification is decisive for information retrieval and management tasks and plays an imperative role in natural language processing (NLP) problems in web engineering. Traditional machine learning algorithms extract the desired features from web pages, whereas deep learning algorithms learn features as the network goes deeper. Pre-trained models such as BERT attain remarkable achievements in text classification and continue to show state-of-the-art results. Knowledge graphs can provide rich, structured factual information for better language modelling and representation. In this study, we propose an ensemble Knowledge Based Deep Inception (KBDI) approach for web page classification that learns bidirectional contextual representations using pre-trained BERT incorporating knowledge graph embeddings, and fine-tunes the target task by applying a Deep Inception network utilizing parallel multi-scale semantics. The proposed ensemble evaluates the efficacy of fusing domain-specific knowledge embeddings with the pre-trained BERT model. Experimental results show that the proposed BERT-fused KBDI model outperforms benchmark baselines and achieves better performance in contrast to other conventional approaches evaluated on web page classification datasets.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#969 - Gurgurov 2024
Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters

Gurgurov, D.; Hartmann, M.; Ostermann, S.

KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():63-74

Association for Computational Linguistics (ACL) 2024

Ref ID: 4346

This paper explores the integration of graph knowledge from linguistic ontologies into multilingual Large Language Models (LLMs) using adapters to improve performance for low-resource languages (LRLs) in sentiment analysis (SA) and named entity recognition (NER). Building upon successful parameter-efficient fine-tuning techniques, such as K-ADAPTER (Wang et al., 2021) and MAD-X (Pfeiffer et al., 2020), we propose a similar approach for incorporating knowledge from multilingual graphs, connecting concepts in various languages with each other through linguistic relationships, into multilingual LLMs for LRLs. Specifically, we focus on eight LRLs —Maltese, Bulgarian, Indonesian, Nepali, Javanese, Uyghur, Tibetan, and Sinhala — and employ language-specific adapters fine-tuned on data extracted from the language-specific section of ConceptNet, aiming to enable knowledge transfer across the languages covered by the knowledge graph. We compare various fine-tuning objectives, including standard Masked Language Modeling (MLM), MLM with full-word masking, and MLM with targeted masking, to analyze their effectiveness in learning and integrating the extracted graph data. Through empirical evaluation on language-specific tasks, we assess how structured graph knowledge affects the performance of multilingual LLMs for LRLs in SA and NER, providing insights into the potential benefits of adapting language models for low-resource scenarios. ©2024 Association for Computational Linguistics.

Srividya voted
Ishan voted

#3497 - Gutiérrez 2024
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models

Gutiérrez, Bernal Jiménez; Shu, Yiheng; Gu, Yu; Yasunaga, Michihiro; Su, Yu

arXiv 2024;():

2024

Ref ID: 8313

In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting. Despite the impressive accomplishments, large language models (LLMs), even with retrieval-augmented generation (RAG), still struggle to efficiently and effectively integrate a large amount of new experiences after pre-training. In this work, we introduce HippoRAG, a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory to enable deeper and more efficient knowledge integration over new experiences. HippoRAG synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of neocortex and hippocampus in human memory. We compare HippoRAG with existing RAG methods on multi-hop question answering and show that our method outperforms the state-of-the-art methods remarkably, by up to 20%. Single-step retrieval with HippoRAG achieves comparable or better performance than iterative retrieval like IRCoT while being 10-30 times cheaper and 6-13 times faster, and integrating HippoRAG into IRCoT brings further substantial gains. Finally, we show that our method can tackle new types of scenarios that are out of reach of existing methods. Code and data are available at https://github.com/OSU-NLP-Group/HippoRAG.
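HippoRAG orchestrates LLMs, a knowledge graph, and the Personalized PageRank algorithm. As a rough illustration of that last primitive only (not the paper's implementation), here is a pure-Python power-iteration sketch over a toy undirected graph; the node names, damping factor, and iteration count are invented for the example, and seed nodes are assumed to appear in the edge list.

```python
def personalized_pagerank(edges, seeds, alpha=0.85, iters=50):
    """Power-iteration Personalized PageRank on an undirected graph.
    edges: list of (u, v) pairs; seeds: nodes the random walk restarts from
    (in a HippoRAG-style setup these would be query entities linked into the KG)."""
    nodes = sorted({n for e in edges for n in e})
    nbrs = {n: [] for n in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {n: (1 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        # With prob. (1 - alpha) restart at a seed; otherwise follow an edge.
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            share = alpha * rank[n] / len(nbrs[n])
            for m in nbrs[n]:
                nxt[m] += share
        rank = nxt
    return rank

edges = [("llm", "kg"), ("kg", "pagerank"), ("pagerank", "memory")]
scores = personalized_pagerank(edges, seeds={"kg"})
```

Because the walk restarts only at the seed set, probability mass concentrates in the seeds' neighborhood, which is what lets a query's entities pull in nearby graph context.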

Xinchen voted
mohammed afaan voted

#1859 - Hahn 2019
Self-knowledge distillation in natural language processing

Hahn, S.; Choi, H.

International Conference Recent Advances in Natural Language Processing, RANLP 2019;2019-September():423-430

Incoma Ltd 2019

DOI: 10.26615/978-954-452-056-4_050 · Ref ID: 5773

Since deep learning became a key player in natural language processing (NLP), many deep learning models have been showing remarkable performance in a variety of NLP tasks, and in some cases they even outperform humans. Such high performance can be explained by the efficient knowledge representation of deep learning models. While many methods have been proposed to learn more efficient representations, knowledge distillation from pretrained deep networks suggests that we can use more information from the soft target probabilities to train other neural networks. In this paper, we propose a new knowledge distillation method, self-knowledge distillation, based on the soft target probabilities of the training model itself, where multimode information is distilled from the word embedding space right below the softmax layer. Due to the time complexity, our method approximates the soft target probabilities. In experiments, we applied the proposed method to two different and fundamental NLP tasks: language modeling and neural machine translation. The experimental results show that our proposed method improves performance on both tasks. © 2019 Association for Computational Linguistics (ACL). All rights reserved.

Ishan voted
brandon voted

#1268 - Han 2023
Enhancing the Effect of BERT Model in the Medical Field Based on the Knowledge Graph

Han, X.; Zhang, L.

Proceedings of SPIE - The International Society for Optical Engineering 2023;12724():

SPIE 2023

DOI: 10.1117/12.2687418 · Ref ID: 5290

A knowledge graph is a form of knowledge representation that captures information about entities, entity attributes, and relationships between entities in a structured way, and is widely used in intelligent retrieval, recommendation systems, intelligent question answering, etc. A knowledge graph is a graphical representation of the relationships between different concepts and topics in a specific field, while BERT is a state-of-the-art language model that can understand the context and meaning of words in text. Combining a medical knowledge graph (MKG) with the bidirectional encoder representations of the BERT model shows promise for improving medical information retrieval and decision-making: it can create a more comprehensive and accurate representation of medical knowledge that can be used to guide clinical decision-making and improve patient prognosis, ultimately enhancing the effectiveness of BERT in the medical field. © 2023 SPIE.

Ishan voted
Srividya voted

#2881 - Hao 2023
Semantic Comprehension Method for Chinese Sentences Based on Minimal Semantic Structures and Its Application

Hao, W.; Qianru, H.

Chinese Journal of Electronics 2023;32(3):613-624

2023

DOI: 10.23919/cje.2021.00.161 · Ref ID: 6421

The importance of small Chinese sentences is no less than that of full sentences, which is an inherent feature of Chinese itself. Based on this characteristic, this paper proposes a sentence semantic understanding method for Chinese scientific and technological abstracts based on the minimal semantic structure. First, a conceptual model was established for identifying the minimal semantic structure of a sentence, based on a corpus of verbs, relative words, prepositions, and markers built with the Language Technology Platform (LTP) tools. Second, the model was used to extract the minimal semantic structure of abstract sentences. Finally, three experiments were carried out: classification of abstract sentences, knowledge graph generation, and automatic semantic inference discovery. Our study confirmed the practical value of the small Chinese sentence. The experimental results show that using small sentences to understand the semantics of Chinese text works better than using full-stop sentences, and that the minimal semantic structure can serve as the basic unit of Chinese sentence semantic comprehension. This method is conducive to the automatic understanding of the basic semantics of sentences in unstructured Chinese science and technology text.

Mike voted
mohammed afaan voted

#3070 - Haque 2024
Utilizing Structural Metrics from Knowledge Graphs to Enhance the Robustness Quantification of Large Language Models (Extended Abstract)

Haque, M. A.; Kamal, M.; George, R.; Gupta, K. D.

2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA) 2024;():1-2

2024

DOI: 10.1109/DSAA61799.2024.10722791 · Ref ID: 6087

The goal of this study is to determine whether large language models (LLMs) like CodeLlama, Mistral, and Vicuna can be used to build knowledge graphs (KGs) from textual data. We create class descriptions for well-known KGs such as DBpedia, YAGO, and Google Knowledge Graph, from which we extract RDF triples and enhance these graphs using different preprocessing methods. Six structural quality measures are used in the study to compare the constructed and existing KGs. Our results demonstrate how important LLMs are to improving KG construction and provide insightful information for KG construction researchers. Moreover, an in-depth analysis of popular open-source LLM models enables researchers to identify the most efficient model for various tasks, ensuring optimal performance in specific applications.

Xinchen voted
Srividya voted

#695 - Harper 2022
Question Answering with Additive Restrictive Training (QuAART): Question Answering for the Rapid Development of New Knowledge Extraction Pipelines

Harper, C. A.; Daniel, R.; Groth, P.

23rd International Conference on Knowledge Engineering and Knowledge Management (EKAW) 2022;13514():51-65

Bozen Bolzano, ITALY Springer International Publishing Ag 2022

DOI: 10.1007/978-3-031-17105-5_4 · Ref ID: 3786

Numerous studies have explored the use of language models and question answering techniques for knowledge extraction. In most cases, these models are trained on data specific to the new task at hand. We hypothesize that using models trained only on generic question answering data (e.g. SQuAD) is a good starting point for domain-specific entity extraction. We test this hypothesis, and explore whether the addition of small amounts of training data can help lift model performance. We pay special attention to the use of null answers and unanswerable questions to optimize performance. To our knowledge, no studies have been done to evaluate the effectiveness of this technique. We do so for an end-to-end entity mention detection and entity typing task on HAnDS and FIGER, two common evaluation datasets for fine-grained entity recognition. We focus on fine-grained entity recognition because it is a challenging scenario, and because the long tail of types in this task highlights the need for entity extraction systems that can deal with new domains and types. To our knowledge, we are the first system beyond those presented in the original FIGER and HAnDS papers to tackle the task in an end-to-end fashion. Using an extremely small sample from the distantly-supervised HAnDS training data - 0.0015%, or fewer than 500 passages randomly chosen out of 31 million - we produce a CoNLL F1 score of 73.72 for entity detection on FIGER. Our end-to-end detection and typing evaluation produces macro and micro F1s of 45.11 and 54.75, based on the FIGER evaluation metrics. This work provides a foundation for the rapid development of new knowledge extraction pipelines.

Mike voted
mohammed afaan voted

#39 - Harrer 2023
Attention is not all you need: the complicated case of ethically using large language models in healthcare and medicine

Harrer, S.

EBioMedicine 2023;90():12

2023

DOI: 10.1016/j.ebiom.2023.104512 · Ref ID: 3678

Large Language Models (LLMs) are a key component of generative artificial intelligence (AI) applications for creating new content including text, imagery, audio, code, and videos in response to textual instructions. Without human oversight, guidance and responsible design and operation, such generative AI applications will remain a party trick with substantial potential for creating and spreading misinformation or harmful and inaccurate content at unprecedented scale. However, if positioned and developed responsibly as companions to humans augmenting but not replacing their role in decision making, knowledge retrieval and other cognitive processes, they could evolve into highly efficient, trustworthy, assistive tools for information management. This perspective describes how such tools could transform data management workflows in healthcare and medicine, explains how the underlying technology works, provides an assessment of risks and limitations, and proposes an ethical, technical, and cultural framework for responsible design, development, and deployment. It seeks to incentivise users, developers, providers, and regulators of generative AI that utilises LLMs to collectively prepare for the transformational role this technology could play in evidence-based sectors.

yuexi voted
mohammed afaan voted

#61 - He 2020
BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models

He, B.; Zhou, D.; Xiao, J. H.; Jiang, X.; Liu, Q.; Yuan, N. J.; Xu, T.

Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2020;():2281-2290

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 3159

Complex node interactions are common in knowledge graphs (KGs), and these interactions can be considered as contextualized knowledge existing in the topological structure of KGs. Traditional knowledge representation learning (KRL) methods usually treat a single triple as a training unit, neglecting the use of graph contextualized knowledge. To utilize this unexploited graph-level knowledge, we propose an approach to model subgraphs in a medical KG. Then, the learned knowledge is integrated with a pre-trained language model for knowledge generalization. Experimental results demonstrate that our model achieves state-of-the-art performance on several medical NLP tasks, and the improvement over MedERNIE indicates that graph contextualized knowledge is beneficial.

Srividya voted
Ishan voted

#1650 - He 2024
MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models

He, J.; Liu, J.; Wang, L.; Li, X.; Xu, X.

Proceedings - IEEE International Conference on Multimedia and Expo 2024;():

IEEE Computer Society 2024

DOI: 10.1109/ICME57554.2024.10687798 · Ref ID: 4147

Knowledge Graph Completion (KGC) aims to conduct reasoning on the facts within knowledge graphs and automatically infer missing links. Existing methods can mainly be categorized as structure-based or description-based. Structure-based methods effectively represent relational facts in knowledge graphs using entity embeddings, while description-based methods leverage pre-trained language models (PLMs) to understand textual information. In this paper, we propose Momentum Contrast for knowledge graph completion with Structure-Augmented pre-trained language models (MoCoSA), which allows the PLM to perceive structural information through an adaptable structure encoder. We propose momentum hard-negative and intra-relation negative sampling to improve learning efficiency. Experimental results demonstrate that our approach achieves state-of-the-art performance in terms of mean reciprocal rank (MRR), with improvements of 2.5% on WN18RR and 21% on OpenBG500. © 2024 IEEE.
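The abstract reports gains in mean reciprocal rank (MRR), the standard ranking metric for KGC. As a quick reference (not the paper's evaluation code), here is a minimal sketch of how MRR is computed over ranked link-prediction outputs; the query and entity names are purely illustrative.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """MRR over link-prediction queries: for each query, find the 1-based
    rank of the gold entity in the model's ranked candidate list, then
    average the reciprocal ranks across all queries."""
    total = 0.0
    for query, candidates in ranked_lists.items():
        rank = candidates.index(gold[query]) + 1  # 1-based rank of gold entity
        total += 1.0 / rank
    return total / len(ranked_lists)

# Two toy queries: gold entity ranked 1st for q1, 2nd for q2.
preds = {"q1": ["paris", "lyon", "nice"], "q2": ["rome", "milan"]}
gold = {"q1": "paris", "q2": "milan"}
print(mean_reciprocal_rank(preds, gold))  # (1/1 + 1/2) / 2 = 0.75
```

MRR rewards placing the correct entity near the top of the ranking, which is why it is the headline number for completion benchmarks like WN18RR.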

Srividya voted
Xinchen voted

#1550 - He 2023
LAL-JER: Label-Aware Learning for Adaptive Joint Entity and Relation Extraction with LLM data augmentation

He, M.; Bai, Y.

ACM International Conference Proceeding Series 2023;():414-419

Association for Computing Machinery 2023

DOI: 10.1145/3640912.3640993 · Ref ID: 4767

Joint entity and relation extraction has achieved great improvements in Natural Language Processing (NLP) and has been widely applied, for example in knowledge graph construction, query understanding, and question answering. Existing methods usually spend a long time fitting models to particular datasets with given label types, which greatly limits their ability to generalize: the model cannot make predictions on label types that were not seen in the training set. To address this issue, we propose to use prompts to incorporate the semantic meaning of the label type description. Furthermore, we use a large language model to perform data augmentation to improve the robustness of our model during training. Extensive experiments and an ablation study on two joint entity and relation extraction benchmarks validate the effectiveness of our work: 1. our method achieves state-of-the-art performance on a joint entity and relation extraction benchmark based on the pre-trained language model BERT; 2. given prompts, our method helps the model make predictions on label types unseen before. © 2023 ACM.

Kwesi voted
mohammed afaan voted

#3743 - He 2020
On the Role of Conceptualization in Commonsense Knowledge Graph Construction

He, Mutian; Song, Yangqiu; Xu, Kun; Yu, Dong

arXiv 2020;():

2020

Ref ID: 7390

Commonsense knowledge graphs (CKGs) like Atomic and ASER are substantially different from conventional KGs, as they consist of a much larger number of nodes formed by loosely structured text. While this enables them to handle highly diverse commonsense queries in natural language, it also leads to unique challenges for automatic KG construction methods. Besides identifying relations absent from the KG between nodes, such methods are also expected to explore absent nodes represented by text, in which different real-world things, or entities, may appear. To deal with the innumerable entities involved with commonsense in the real world, we introduce conceptualization to CKG construction methods, i.e., viewing entities mentioned in text as instances of specific concepts, or vice versa. We build synthetic triples by conceptualization, and further formulate the task as triple classification, handled by a discriminatory model with knowledge transferred from pretrained language models and fine-tuned by negative sampling. Experiments demonstrate that our methods can effectively identify plausible triples and expand the KG with triples of both new nodes and edges of high diversity and novelty.

Xinchen voted
Srividya voted

#486 - He 2023
KRP-DS: A Knowledge Graph-Based Dialogue System with Inference-Aided Prediction

He, Q.; Xu, S. B.; Zhu, Z. F.; Wang, P.; Li, K. F.; Zheng, Q. F.; Li, Y. S.

Sensors 2023;23(15):13

2023

DOI: 10.3390/s23156805 · Ref ID: 3306

With the popularity of ChatGPT, there has been increasing attention towards dialogue systems. Researchers are dedicated to designing a knowledgeable model that can engage in conversations like humans. Traditional seq2seq dialogue models often suffer from limited performance and the issue of generating safe responses. In recent years, large-scale pretrained language models have demonstrated their powerful capabilities across various domains. Many studies have leveraged these pretrained models for dialogue tasks to address concerns such as safe response generation. Pretrained models can enhance responses by carrying certain knowledge information after being pre-trained on large-scale data. However, when specific knowledge is required in a particular domain, the model may still generate bland or inappropriate responses, and the interpretability of such models is poor. Therefore, in this paper, we propose the KRP-DS model. We design a knowledge module that incorporates a knowledge graph as external knowledge in the dialogue system. The module utilizes contextual information for path reasoning and guides knowledge prediction. Finally, the predicted knowledge is used to enhance response generation. Experimental results show that our proposed model can effectively improve the quality and diversity of responses while having better interpretability, and outperforms baseline models in both automatic and human evaluations.

mohammed afaan voted
yuexi voted

#835 - He 2021
Towards Solving the Winograd Schema Challenge: Model-Free, Model-Based and a Spectrum in Between

He, W. N.; Xiao, Z. H.

14th International Conference on Knowledge Science, Engineering, and Management (KSEM) 2021;12816():126-138

Tokyo, JAPAN Springer International Publishing Ag 2021

DOI: 10.1007/978-3-030-82147-0_11 · Ref ID: 3689

The Winograd Schema Challenge (WSC) has attracted much attention recently as common sense is recognized to be not only the key to human-level intelligence but also a bottleneck faced by recent progress. Although neural language models (LMs) have achieved state-of-the-art (SOTA) performance on WSC, they fall short on interpretability and robustness against adversarial attacks. Contrarily, methods with structured representation and explicit reasoning suffer from the difficulty of knowledge acquisition and the rigidness of representation. In this paper, we look back on the current model-free and model-based approaches, pointing out the missing ingredients towards solving the WSC. We report our preliminary exploration of formalizing the WSC problems using a variant of first-order language and our first-hand findings of indispensable capabilities of human-level commonsense reasoning. The issues we encounter suggest that a full spectrum of representation tools and reasoning abilities are called for.

Kwesi voted
brandon voted

#3966 - Heim 2024
Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets

Heim, Desiree; Jilek, Christian; Ulges, Adrian; Dengel, Andreas

arXiv 2024;():

2024

Ref ID: 8582

Current publicly available knowledge work data collections lack diversity, extensive annotations, and contextual information about the users and their documents. These issues hinder objective and comparable data-driven evaluations and optimizations of knowledge work assistance systems. Due to the considerable resources needed to collect such data in real-life settings and the necessity of data censorship, collecting such a dataset appears nearly impossible. For this reason, we propose a configurable, multi-agent knowledge work dataset generator. This system simulates collaborative knowledge work among agents producing Large Language Model-generated documents and accompanying data traces. Additionally, the generator captures all background information, given in its configuration or created during the simulation process, in a knowledge graph. Finally, the resulting dataset can be utilized and shared without privacy or confidentiality concerns. This paper introduces our approach's design and vision and focuses on generating authentic knowledge work documents using Large Language Models. Our study involving human raters who assessed 53% of the generated and 74% of the real documents as realistic demonstrates the potential of our approach. Furthermore, we analyze the authenticity criteria mentioned in the participants' comments and elaborate on potential improvements for identified common issues.

Davis voted
mohammed afaan voted

#2604 - Helali 2024
KGLiDS: A Platform for Semantic Abstraction, Linking, and Automation of Data Science

Helali, M.; Monjazeb, N.; Vashisth, S.; Carrier, P.; Helal, A.; Cavalcante, A.; Ammar, K.; Hose, K.; Mansour, E.

2024 IEEE 40th International Conference on Data Engineering (ICDE) 2024;():179-192

2024

DOI: 10.1109/ICDE60146.2024.00021 · Ref ID: 6233

In recent years, we have witnessed the growing interest from academia and industry in applying data science technologies to analyze large amounts of data. In this process, a myriad of artifacts (datasets, pipeline scripts, etc.) are created. However, there has been no systematic attempt to holistically collect and exploit all the knowledge and experiences that are implicitly contained in those artifacts. Instead, data scientists recover information and expertise from colleagues or learn via trial and error. Hence, this paper presents a scalable platform, KGLiDS, that employs machine learning and knowledge graph technologies to abstract and capture the semantics of data science artifacts and their connections. Based on this information, KGLiDS enables various downstream applications, such as data discovery and pipeline automation. Our comprehensive evaluation covers use cases in data discovery, data cleaning, transformation, and AutoML. It shows that KGLiDS is significantly faster with a lower memory footprint than the state-of-the-art systems while achieving comparable or better accuracy.

mohammed afaan voted
Ishan voted

#2898 - Henson 2012
Semantic Perception: Converting Sensory Observations to Abstractions

Henson, C.; Sheth, A.; Thirunarayan, K.

IEEE Internet Computing 2012;16(2):26-34

2012

DOI: 10.1109/MIC.2012.20 · Ref ID: 6540

An abstraction is a representation of an environment derived from sensor observation data. Generating an abstraction requires inferring explanations from an incomplete set of observations (often from the Web) and updating these explanations on the basis of new information. This process must be fast and efficient. The authors' approach overcomes these challenges to systematically derive abstractions from observations. The approach models perception through the integration of an abductive logic framework called Parsimonious Covering Theory with Semantic Web technologies. The authors demonstrate this approach's utility and scalability through use cases in the healthcare and weather domains.

mohammed afaan voted
yuexi voted

#3342 - Heo 2024
Do LLMs "know" internally when they follow instructions?

Heo, Juyeon; Heinze-Deml, Christina; Elachqar, Oussama; Ren, Shirley; Nallasamy, Udhay; Miller, Andy; Chan, Kwan Ho Ryan; Narain, Jaya

arXiv 2024;():

2024

Ref ID: 8729

Instruction-following is crucial for building AI agents with large language models (LLMs), as these models must adhere strictly to user-provided constraints and guidelines. However, LLMs often fail to follow even simple and clear instructions. To improve instruction-following behavior and prevent undesirable outputs, a deeper understanding of how LLMs' internal states relate to these outcomes is required. Our analysis of LLM internal states reveals a dimension in the input embedding space linked to successful instruction-following. We demonstrate that modifying representations along this dimension improves instruction-following success rates compared to random changes, without compromising response quality. Further investigation reveals that this dimension is more closely related to the phrasing of prompts than to the inherent difficulty of the task or instructions. This discovery also suggests explanations for why LLMs sometimes fail to follow clear instructions and why prompt engineering is often effective, even when the content remains largely unchanged. This work provides insight into the internal workings of LLMs' instruction-following, paving the way for reliable LLM agents.

yuexi voted
Srividya voted

#1627 - Hertling 2021
Matching with Transformers in MELT

Hertling, S.; Portisch, J.; Paulheim, H.

CEUR Workshop Proceedings 2021;3063():13-24

CEUR-WS 2021

Ref ID: 5649

One of the strongest signals for automated matching of ontologies and knowledge graphs is the textual descriptions of the concepts. The methods that are typically applied (such as character- or token-based comparisons) are relatively simple, and therefore do not capture the actual meaning of the texts. With the rise of transformer-based language models, text comparison based on meaning (rather than lexical features) is possible. In this paper, we model the ontology matching task as a classification problem and present approaches based on transformer models. We further provide an easy-to-use implementation in the MELT framework which is suited for ontology and knowledge graph matching. We show that a transformer-based filter helps to choose the correct correspondences given a high-recall alignment and already achieves a good result with simple alignment post-processing methods. Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Srividya voted
Xinchen voted

#1346 - Hitzler 2022
Generalizable Neuro-Symbolic Systems for Commonsense Question Answering

Hitzler, P.; Sarker, M. K.; Oltramari, A.; Francis, J.; Ilievski, F.; Ma, K.; Mirzaee, R.

Front. Artif. Intell. Appl. 2022;342():294-310

IOS Press BV 2022

DOI: 10.3233/FAIA210360 · Ref ID: 5581

This chapter illustrates how suitable neuro-symbolic models for language understanding can enable domain generalizability and robustness in downstream tasks. Different methods for integrating neural language models and knowledge graphs are discussed. The situations in which this combination is most appropriate are characterized, including quantitative evaluation and qualitative error analysis on a variety of commonsense question answering benchmark datasets. © 2022 The authors and IOS Press. All rights reserved.

mohammed afaan voted
yuexi voted

#3156 - Hoang 2024
Semi-automated Construction of Complex Knowledge Base Question Answering Dataset Using Large Language Model

Hoang, Lily; Liausvia, Fiona; Liu, Yan; Nguyen, Thanh-Son

Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part V 2024;():230–248

Vilnius, Lithuania Springer-Verlag 2024

DOI: 10.1007/978-3-031-70362-1_14 · Ref ID: 7148

mohammed afaan voted
Ishan voted

#1977 - Hofer 2024
Towards self-configuring Knowledge Graph Construction Pipelines using LLMs - A Case Study with RML

Hofer, M.; Frey, J.; Rahm, E.

CEUR Workshop Proceedings 2024;3718():

CEUR-WS 2024

Ref ID: 4566

This paper explores using large language models (LLMs) to generate RDF mapping language (RML) files in the RDF turtle format as a key step towards self-configuring RDF knowledge graph construction pipelines. Our case study involves mapping a subset of the Internet Movie Database (IMDB) in JSON format given a target Movie ontology (selection of DBpedia Ontology OWL statements). We define and compute several scores to assess both the generated mapping files and the resulting graph using a manually created reference. Our findings demonstrate the promising potential of the state-of-the-art commercial LLMs in a zero-shot scenario. © 2024 Copyright for this paper by its authors.

Ishan voted
brandon voted

#3752 - Hogan 2022
An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods

Hogan, William

arXiv 2022;():

2022

Ref ID: 7564

Relation Extraction (RE) is a foundational task of natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying relational information between entity pairs found in text. RE has numerous uses, such as knowledge graph completion, text summarization, question-answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical-based RE, neural-based RE, and large language model-based RE. This survey begins with an overview of a few exemplary works in the earlier phases of RE, highlighting limitations and shortcomings to contextualize progress. Next, we review popular benchmarks and critically examine metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods.

mohammed afaan voted
yuexi voted

#2747 - Hogl 2001
On supporting medical quality with intelligent data mining

Hogl, O.; Muller, M.; Stoyan, H.; Stuhlinger, W.

Proceedings of the 34th Annual Hawaii International Conference on System Sciences 2001;():10 pp.

2001

DOI: 10.1109/HICSS.2001.926557 · Ref ID: 6308

The healthcare sector is currently facing both the economic necessity and the technical opportunity of a data based approach to quality management. Against this background, we introduce a process model for a data based medical quality management and apply intelligent data mining methods to patient data. Intelligent data mining incorporates advantages of both knowledge acquisition from data and from experts. We present the Knowledge Discovery Question Language (KDQL), a controlled language for business questions which abstracts from database and data mining terminology to allow high-level interaction. We use a knowledge-based measurement of relevant subjective interestingness facets like novelty, usefulness, and understandability which enables flexible ways to access the results of data mining. Questions asked in this project were targeted on diagnostic and therapeutic measures as well as the quality of documentation. For these issues in the field of medical quality management interesting results were found.

mohammed afaan voted
yuexi voted

#358 - Hong 2021
Improving Relation Extraction by Knowledge Representation Learning

Hong, W. X.; Li, S. Y.; Hu, Z. Q.; Rasool, A.; Jiang, Q. S.; Weng, Y.; Soc, Ieee Comp

IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) 2021;():1211-1215

Electr Network Ieee Computer Soc 2021

DOI: 10.1109/ictai52525.2021.00191 · Ref ID: 3672

Relation extraction is an important NLP task that extracts the semantic relationship between two entities. Recently, large-scale pre-trained language models have achieved excellent performance in many NLP applications. Most existing relation extraction models rely mainly on context information, but entity information is also very important for relation extraction, especially domain knowledge of entities and the direction between entity pairs. In this paper, based on the pre-trained BERT model, we propose a multi-task joint relation extraction model incorporating knowledge representation learning (KRL). The experimental results on the SemEval 2010 Task 8 dataset and the KBP37 dataset show that our proposed model outperforms most state-of-the-art methods. The results on the larger dataset FewRe180, refined from FewRel, also indicate that adding knowledge representation learning as an auxiliary objective is helpful for the relation extraction task.

Mike voted
Srividya voted

#2717 - Hoogs 2001
Multi-modal fusion for video understanding

Hoogs, A.; Mundy, J.; Cross, G.

Proceedings 30th Applied Imagery Pattern Recognition Workshop (AIPR 2001). Analysis and Understanding of Time Varying Imagery 2001;():103-108

2001

DOI: 10.1109/AIPR.2001.991210 · Ref ID: 6932

The exploitation of semantic information in computer vision problems can be difficult because of the large difference in representations and levels of knowledge. Image analysis is formulated in terms of low-level features describing image structure and intensity, while high-level knowledge such as purpose and common sense are encoded in abstract, non-geometric representations. In this work we attempt to bridge this gap through the integration of image analysis algorithms with WordNet, a large semantic network that explicitly links related words in a hierarchical structure. Our problem domain is the understanding of broadcast news, as this provides both linguistic information in the transcript and video information. Visual detection algorithms such as face detection and object tracking are applied to the video to extract basic object information, which is indexed into WordNet. The transcript provides topic information in the form of detected keywords. Together, both types of information are used to constrain a search within WordNet for a description of the video content in terms of the most likely WordNet concepts. This project is in its early stages; the general ideas and concepts are presented here.

mohammed afaan voted
yuexi voted

#1406 - Hoppe 2022
Improving Zero-Shot Text Classification with Graph-based Knowledge Representations

Hoppe, F.

CEUR Workshop Proceedings 2022;3165():

CEUR-WS 2022

Ref ID: 5568

Insufficient training data is a key challenge for text classification. In particular, long-tail class distributions and emerging, new classes do not provide any training data for specific classes. Therefore, such a zero-shot setting must incorporate additional, external knowledge to enable transfer learning by connecting the external knowledge of previously unseen classes to texts. Recent zero-shot text classifiers utilize only distributional semantics defined by large language models and based on class names or natural language descriptions. This implicit knowledge contains ambiguities, is not able to capture logical relations, and is not an efficient representation of factual knowledge. These drawbacks can be avoided by introducing explicit, external knowledge. In particular, knowledge graphs provide such explicit, unambiguous, complementary, domain-specific knowledge. Hence, this thesis explores graph-based knowledge as an additional modality for zero-shot text classification. Besides a general investigation of this modality, the influence of including domain-specific knowledge on the ability to deal with domain shifts is explored. © 2022 Copyright for this paper by its authors.

Srividya voted
Xinchen voted

#3069 - Hossain 2023
Utilizing GloVe Embeddings for Deep Learning-Based Analysis of Research Paper Abstracts

Hossain, A.; Konok, U. H.; Islam, R.; Ruhani, R. M. Karmol; Musfikin, R.; Uddin, M. M.; Khan, M. S. Hossain; Tuhin, R. A.

2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) 2023;():1-6

2023

DOI: 10.1109/HORA58378.2023.10156746 · Ref ID: 6166

Researchers are finding it harder and harder to locate relevant articles as the body of scientific literature expands at an exponential rate. Due to the sheer volume of publications, manual classification and categorization of these articles is no longer possible. By addressing the task of accurately classifying research papers based on their abstracts, this paper aims to improve recommendation and search procedures for efficient academic information retrieval. When classifying papers in computer science, mathematics, physics, and statistics, the models achieve high precision, accuracy, recall, and F1-score by using deep learning algorithms (LSTM, GRU, Bi-LSTM, and Bi-GRU) and GloVe word embeddings to capture semantic information. The LSTM, Bi-LSTM, GRU, and Bi-GRU models were used to accurately classify the abstracts of research papers into computer science, mathematics, physics, and statistics. Automatic categorization of research papers was made possible by combining GloVe word embeddings with deep learning algorithms, which sped up information search and knowledge discovery. These models can help academic researchers and practitioners streamline the process of categorizing research papers and boost their research efforts.

Mike voted
Srividya voted

#60 - Hou 2020
BERT-Based Chinese Relation Extraction for Public Security

Hou, J. Q.; Li, X.; Yao, H. P.; Sun, H. C.; Mai, T. L.; Zhu, R. C.

IEEE Access 2020;8():132367-132375

2020

DOI: 10.1109/access.2020.3002863 · Ref ID: 3512

The past few years have witnessed several public safety incidents around the world. With the advent of the big data era, effectively extracting public security information from the internet has become of great significance. Up to hundreds of TBs of data are injected into the network every second, so it is impossible to process them manually. Natural Language Processing (NLP) is dedicated to the development of intelligent systems for effective text information mining. By analysing text and quickly extracting the relationships between the relevant entities, NLP can establish a knowledge graph (KG) of public security, which lays the foundation for safety case analysis, information monitoring, and activity tracking and locating. One common pre-training model for relation extraction is Word2Vec. The Word2Vec model is a single mapping that produces a static, single representation of the words in sentences. In contrast, the BERT model considers contextual information and provides more dynamic, richer vector representations of words. Therefore, in this paper, we propose a Bidirectional Encoder Representations from Transformers (BERT) based Chinese relation extraction algorithm for public security, which can effectively mine security information. The BERT model is obtained by training on the masked language model and next-sentence prediction tasks; it is based on the Transformer encoder, and the main model structure is stacked Transformers. Extensive simulations are conducted to evaluate our proposed algorithm in comparison with state-of-the-art schemes.

brandon voted
Kwesi voted

#2029 - Hou 2022
What Has Been Enhanced in my Knowledge-Enhanced Language Model?

Hou, Y.; Fu, G.; Sachan, M.

Findings of the Association for Computational Linguistics: EMNLP 2022 2022;():1417-1438

Association for Computational Linguistics (ACL) 2022

Ref ID: 5482

A number of knowledge integration (KI) methods have recently been proposed to incorporate external knowledge into pretrained language models (LMs). Even though knowledge-enhanced LMs outperform base LMs on knowledge-intensive tasks, the inner workings of these KI methods are not well understood. For instance, it is unclear which knowledge is effectively integrated into knowledge-enhanced LMs and which is not, and whether such integration leads to catastrophic forgetting of already learned knowledge. We show that existing model interpretation methods such as linear probes and prompts have some key limitations in answering these questions. We revisit KI from an information-theoretic view and propose a new theoretically sound probe called Graph Convolution Simulator (GCS) for KI interpretation. GCS uses graph attention on the corresponding knowledge graph for interpretation. In our experiments, we verify that GCS can provide reasonable interpretation results for two well-known knowledge-enhanced LMs: ERNIE and K-Adapter. We also find that only a marginal amount of knowledge is successfully integrated in these models, and simply increasing the size of the KI corpus may not lead to better knowledge-enhanced LMs. © 2022 Association for Computational Linguistics.

Mike voted
Srividya voted

#66 - Hou 2024
Bibliometric Analysis on the Research of Geoscience Knowledge Graph (GeoKG) from 2012 to 2023

Hou, Z. W.; Liu, X. L.; Zhou, S. N.; Jing, W. L.; Yang, J.

ISPRS Int. J. Geo-Inf. 2024;13(7):21

2024

DOI: 10.3390/ijgi13070255 · Ref ID: 3237

The geoscience knowledge graph (GeoKG) has gained worldwide attention due to its ability to formally represent the spatiotemporal features and relationships of geoscience knowledge. Currently, a quantitative review of the state and trends of GeoKG research is still scarce. Thus, a bibliometric analysis was performed in this study to fill the gap. Specifically, based on 294 research articles published from 2012 to 2023, we conducted analyses of (1) trends in publications and citations; (2) the major papers, sources, researchers, institutions, and countries; (3) scientific collaboration; and (4) major research topics and tendencies. The results revealed that interest in GeoKG research increased rapidly after 2019 and is continually expanding. China is the most productive country in this field. Co-authorship analysis shows that international and inter-institutional collaboration should be reinforced. Keyword analysis indicated that geoscience knowledge representation, information extraction, GeoKG construction, and GeoKG-based multi-source data integration are current hotspots. In addition, several important but currently neglected issues, such as the integration of Large Language Models, are highlighted. The findings of this review provide a systematic overview of the development of GeoKG and a valuable reference for future research.

mohammed afaan voted
yuexi voted

#2274 - Hsiao 2007
Constructing Human Brain-Function Association Models from fMRI Literature

Hsiao, M. Y.; Chen, D. Y.; Chen, J. H.

2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2007;():1188-1191

2007

DOI: 10.1109/IEMBS.2007.4352509 · Ref ID: 6272

Toward the goal of understanding human brain function, we have developed a web-based human brain functional mapping knowledge base (HBFMKB) system to mine human brain-function association models from vast MEDLINE abstracts. Since nomenclature and relationships among cognitive functions have no consensus yet, we use rule-based natural language processing methods to extract behavioral tasks and cognitive functions, and we perform n-gram approximate concept mapping using the Unified Medical Language System (UMLS) knowledge source. The HBFMKB system has an automatic PubMed MEDLINE download and import system, a named entity extraction system, and an interactive visualization system. In summary, the HBFMKB system helps scientists digest current knowledge before designing experiments and compare their results with the current literature.

mohammed afaan voted
yuexi voted

#3600 - Hu 2024
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models

Hu, Chenhui; Cao, Pengfei; Chen, Yubo; Liu, Kang; Zhao, Jun

arXiv 2024;():

2024

Ref ID: 8529

Knowledge editing aims to update outdated or incorrect knowledge in large language models (LLMs). However, current knowledge editing methods have limited scalability for lifelong editing. This study explores the fundamental reason why knowledge editing fails in lifelong editing. We begin with the closed-form solution derived from linear associative memory, which underpins state-of-the-art knowledge editing methods. We extend the solution from single editing to lifelong editing, and through rigorous mathematical derivation, identify an interference term in the final solution, suggesting that editing knowledge may impact irrelevant knowledge. Further analysis of the interference term reveals a close relationship with superposition between knowledge representations. When knowledge superposition does not exist in language models, the interference term vanishes, allowing for lossless knowledge editing. Experiments across numerous language models reveal that knowledge superposition is universal, exhibiting high kurtosis, zero mean, and heavy-tailed distributions with clear scaling laws. Ultimately, by combining theory and experiments, we demonstrate that knowledge superposition is the fundamental reason for the failure of lifelong editing. Moreover, this is the first study to investigate knowledge editing from the perspective of superposition and provides a comprehensive observation of superposition across numerous real-world language models. Code available at https://github.com/ChenhuiHu/knowledge_in_superposition.

Mike voted
Srividya voted

#2000 - Hu 2023
Two-stage open information extraction method for the defence technology field

Hu, M.; Wang, F.; Xu, X.; Luo, W.; Liu, X.; Luo, Z.; Tan, Y.

Qinghua Daxue Xuebao 2023;63(9):1309-1316

2023

DOI: 10.16511/j.cnki.qhdxxb.2023.21.010 · Ref ID: 5298

[Objective] The abundant information resources available on the internet about defense technology are vitally important as data sources for obtaining high-value military intelligence. The aim of open information extraction in the field of defense technology is to extract structured triplets containing subject, predicate, object, and other arguments from the massive amount of information available on the internet. This technology has important implications for ontology induction and the construction of knowledge graphs in the defense technology domain. However, while information extraction in the general domain yields good results, open information extraction in the defense technology domain faces several challenges, such as a lack of domain-annotated data, an inability to handle overlapping arguments, and unrecognizable long entities. [Methods] In this paper, an annotation strategy based on entity boundaries is proposed, and an annotated dataset in the defense technology field was constructed with the experience of domain experts. Furthermore, a two-stage open information extraction method for the defense technology field is proposed that utilizes a pretrained language model-based sequence labeling algorithm to extract predicates and a multihead attention mechanism to learn the prediction of argument boundaries. In the first stage, the input sentence was converted into an input sequence <[CLS], input sentence [SEP]>, and the input sequence was encoded using a pretrained language model to obtain a hidden-state representation of the input sequence. Based on this sentence representation, a conditional random field (CRF) layer was used to predict the positions of the predicates, i.e., to predict the BIO labels of the words.
In the second stage, the predicates predicted in the first stage were concatenated with the original sentence and converted into an input sequence <[CLS], predicate [SEP], input sentence [SEP]>, which was encoded using a pretrained language model to obtain a hidden-state representation of the input sequence. This representation was then fed to a multihead pointer network to predict the positions of the arguments. The predicted positions were compared with the actual positions to calculate the cross-entropy loss. Finally, the predicates and arguments predicted by the predicate and argument extraction models were combined to obtain the complete triplet. [Results] Extensive experiments conducted on a self-built annotated dataset in the defense technology field reveal the following. (1) In predicate extraction, our method achieved a 3.92% improvement in F1 value over LSTM methods and more than a 10% improvement over syntactic analysis methods. (2) In argument extraction, our method achieved a considerable improvement of more than 16% in F1 value over LSTM methods and about 11% in F1 value over the BERT + CRF method. [Conclusions] The proposed two-stage open information extraction method can overcome the challenges of overlapping arguments and long-span entity extraction, thus improving on the shortcomings of existing open information extraction methods. Extensive experimental analysis conducted on the self-built annotated dataset proved the effectiveness of the proposed method. © 2023 Press of Tsinghua University. All rights reserved.

Mike voted
Srividya voted

#204 - Hu 2023
An empirical study of pre-trained language models in simple knowledge graph question answering

Hu, N.; Wu, Y. K.; Qi, G. L.; Min, D. H.; Chen, J. Y.; Pan, J. Z.; Ali, Z.

World Wide Web 2023;26(5):2855-2886

2023

DOI: 10.1007/s11280-023-01166-y · Ref ID: 3203

Large-scale pre-trained language models (PLMs) such as BERT have recently achieved great success and become a milestone in natural language processing (NLP). It is now the consensus of the NLP community to adopt PLMs as the backbone for downstream tasks. In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models. However, there is still a lack of comprehensive research and comparison of the performance of different PLMs in KGQA. To this end, we summarize two basic KGQA frameworks based on PLMs without additional neural network modules to compare the performance of nine PLMs in terms of accuracy and efficiency. In addition, we present three benchmarks for larger-scale KGs based on the popular SimpleQuestions benchmark to investigate the scalability of PLMs. We carefully analyze the results of all PLMs-based KGQA basic frameworks on these benchmarks and two other popular datasets, WebQuestionSP and FreebaseQA, and find that knowledge distillation techniques and knowledge enhancement methods in PLMs are promising for KGQA. Furthermore, we test ChatGPT (https://chat.openai.com/), which has drawn a great deal of attention in the NLP community, demonstrating its impressive capabilities and limitations in zero-shot KGQA. We have released the code and benchmarks to promote the use of PLMs on KGQA (https://github.com/aannonymouuss/PLMs-in-Practical-KBQA).

Srividya voted
Ishan voted

#3217 - Hu 2024
Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs

Hu, Nan; Chen, Jiaoyan; Wu, Yike; Qi, Guilin; Bi, Sheng; Wu, Tongtong; Pan, Jeff Z.

arXiv 2024;():

2024

Ref ID: 8049

Attribution in question answering is the provision of citations to support generated statements, and it has attracted wide research attention. Current methods for automatically evaluating attribution, which are often based on Large Language Models (LLMs), are still inadequate, particularly in recognizing subtle differences between attributions and complex relationships between citations and statements. To compare these attribution evaluation methods and develop new ones, we introduce a set of fine-grained categories (i.e., supportive, insufficient, contradictory, and irrelevant) for measuring attribution, and we develop a Complex Attributed Question Answering (CAQA) benchmark by leveraging knowledge graphs (KGs) to automatically generate attributions of different categories for question-answer pairs. Our analysis reveals that existing evaluators perform poorly under fine-grained attribution settings and exhibit weaknesses in complex citation-statement reasoning. Our CAQA benchmark, validated with human annotations, emerges as a promising tool for selecting and developing LLM attribution evaluators.

Ishan voted
Xinchen voted

#2049 - Hu 2024
ZipZap: Efficient Training of Language Models for Large-Scale Fraud Detection on Blockchain

Hu, S.; Huang, T.; Chow, K. H.; Wei, W.; Wu, Y.; Liu, L.

WWW 2024 - Proceedings of the ACM Web Conference 2024;():2807-2816

Association for Computing Machinery, Inc 2024

DOI: 10.1145/3589334.3645352 · Ref ID: 4071

Language models (LMs) have demonstrated superior performance in detecting fraudulent activities on blockchains. Nonetheless, the sheer volume of blockchain data results in excessive memory and computational costs when training LMs from scratch, limiting their applicability to large-scale applications. In this paper, we present ZipZap, a framework tailored to achieve both parameter and computational efficiency when training LMs on large-scale transaction data. First, with frequency-aware compression, an LM can be compressed down to a mere 7.5% of its initial size with an imperceptible performance dip. This technique correlates the embedding dimension of an address with its occurrence frequency in the dataset, motivated by the observation that embeddings of low-frequency addresses are insufficiently trained, negating the need for a uniformly large dimension for knowledge representation. Second, ZipZap accelerates training through an asymmetric training paradigm: it performs transaction dropping and cross-layer parameter sharing to expedite pre-training, while reverting to the standard training paradigm for fine-tuning to strike a balance between efficiency and efficacy, motivated by the observation that the optimization goals of pre-training and fine-tuning are inconsistent. Evaluations on real-world, large-scale datasets demonstrate that ZipZap delivers notable parameter and computational efficiency improvements for training LMs. Our implementation is available at: https://github.com/git-disl/ZipZap. © 2024 Owner/Author.

Mike voted
Ishan voted

#2120 - Hu 2018
Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (Extended Abstract)

Hu, S.; Zou, L.; Yu, J. X.; Wang, H.; Zhao, D.

2018 IEEE 34th International Conference on Data Engineering (ICDE) 2018;():1815-1816

2018

DOI: 10.1109/ICDE.2018.00265 · Ref ID: 6116

RDF question answering (Q/A) allows users to ask questions in natural language over a knowledge base represented in RDF. To answer a natural language question, existing works focus on question understanding to deal with the disambiguation of phrase linking, while ignoring query composition and execution. In this paper, we propose a systematic framework to answer natural language questions over an RDF repository (RDF Q/A) from a graph data-driven perspective. We propose the (super) semantic query graph to model the query intention of the natural language question in a structural way, based on which RDF Q/A is reduced to a subgraph matching problem. More importantly, we resolve the ambiguity of both phrases and structures at the time when matches of the query are found. To build the super semantic query graph, we propose a node-first framework that is highly robust and can tackle complex questions. Extensive experiments confirm that our method not only improves precision but also greatly speeds up query performance.

Ishan voted
Xinchen voted

#115 - Hu 2024
Combining ChatGPT and knowledge graph for explainable machine learning-driven design: a case study

Hu, X.; Liu, A.; Dai, Y.

J. Eng. Des. 2024;():23

2024

DOI: 10.1080/09544828.2024.2355758 · Ref ID: 3166

Machine learning has been widely used in design activities, enabling more informed decision-making. However, high-performance machine learning models, often referred to as 'black boxes', lack explainability regarding their predictions. The absence of explainability erodes the trust between designers and these models and hinders human-machine collaboration toward desirable design decisions. Explainable AI focuses on creating explanations that are accessible and comprehensible to stakeholders, thereby improving explainability. A recent advancement in the field of explainable AI involves leveraging domain-specific knowledge via knowledge graphs. Additionally, the advent of large language models like ChatGPT, acclaimed for their ability to output domain knowledge, perform complex language processing, and support seamless end-user interaction, has the potential to expand the horizons of explainable AI. Inspired by these developments, we propose a novel hybrid method that synergizes ChatGPT and a knowledge graph to augment post-hoc explainability in the design context. The outcome is the generation of more contextual and meaningful explanations, with the added possibility of further interaction to uncover deeper insights. The effectiveness of the proposed method is illustrated through a case study on customer segmentation.

mohammed afaan voted
Ishan voted

#536 - Hu 2024
LLM-TIKG: Threat intelligence knowledge graph construction utilizing large language model

Hu, Y. L.; Zou, F. T.; Han, J. J.; Sun, X.; Wang, Y. L.

Comput. Secur. 2024;145():11

2024

DOI: 10.1016/j.cose.2024.103999 · Ref ID: 2935

Open-source threat intelligence is often unstructured and cannot be directly applied to subsequent detection and defense. By constructing a knowledge graph from open-source threat intelligence, we can better apply this information to intrusion detection. However, current methods for constructing knowledge graphs face limitations due to the domain-specific attributes of entities and the need to analyze lengthy texts, and they require large amounts of labeled data. Furthermore, there is a lack of authoritative open-source annotated threat intelligence datasets, which would require significant manual effort to create. Moreover, current research often neglects the textual descriptions of attack behaviors, resulting in the loss of vital information for understanding intricate cyber threats. To address these issues, we propose LLM-TIKG, which applies a large language model to construct a knowledge graph from unstructured open-source threat intelligence. The few-shot learning capability of GPT is leveraged to achieve data annotation and augmentation, thereby creating the datasets for fine-tuning a smaller language model (7B). Using the fine-tuned model, we perform topic classification on the collected reports, extract entities and relationships, and extract TTPs from the attack descriptions. This process results in the construction of a threat intelligence knowledge graph, enabling automated and universal analysis of textualized threat intelligence. The experimental results demonstrate improved performance in both named entity recognition and TTP classification, achieving precisions of 87.88% and 96.53%, respectively.

brandon voted
Kwesi voted

#679 - Hu 2023
PROMPTCAP: Prompt-Guided Image Captioning for VQA with GPT-3

Hu, Y. S.; Hua, H.; Yang, Z. Y.; Shi, W. J.; Smith, N. A.; Luo, J. B.; Ieee

IEEE/CVF International Conference on Computer Vision (ICCV) 2023;():2951-2963

Paris, FRANCE Ieee Computer Soc 2023

DOI: 10.1109/iccv51070.2023.00277 · Ref ID: 3770

Knowledge-based visual question answering (VQA) involves questions that require world knowledge beyond the image to yield the correct answer. Large language models (LMs) like GPT-3 are particularly helpful for this task because of their strong knowledge retrieval and reasoning capabilities. To enable an LM to understand images, prior work uses a captioning model to convert images into text. However, when summarizing an image in a single caption sentence, which visual entities to describe is often underspecified. Generic image captions often miss visual details essential for the LM to answer visual questions correctly. To address this challenge, we propose PROMPTCAP (Prompt-guided image Captioning), a captioning model designed to serve as a better connector between images and black-box LMs. Different from generic captions, PROMPTCAP takes a natural-language prompt to control the visual entities to describe in the generated caption. The prompt contains a question that the caption should aid in answering. To avoid extra annotation, PROMPTCAP is trained on examples synthesized with GPT-3 and existing datasets. We demonstrate PROMPTCAP's effectiveness on an existing pipeline in which GPT-3 is prompted with image captions to carry out VQA. PROMPTCAP outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA). Zero-shot results on WebQA show that PROMPTCAP generalizes well to unseen domains.

Kwesi voted
brandon voted

#1053 - Hu 2022
Can Pretrained Language Models Reason on Sparse Commonsense Knowledge Graph?

Hu, Y.; Xu, W.; Liu, Q.; Wang, L.; Wu, S.

2022 IEEE 8th International Conference on Computer and Communications, ICCC 2022 2022;():2016-2022

Institute of Electrical and Electronics Engineers Inc. 2022

DOI: 10.1109/ICCC56324.2022.10065877 · Ref ID: 5317

Commonsense knowledge is the knowledge shared by most humans, and it is typically stored in a commonsense knowledge graph (CKG) as triplets. In this paper, we focus on the task of CKG completion, whose target is to predict the tail (head) entity given the head (tail) entity and the relation. Most existing works employ graph-based models, which aggregate information from neighboring entities on the CKG. Despite their effectiveness, they still suffer from two main weaknesses. First, the semantic relations between head and tail entities are neglected. Second, due to the sparsity of CKGs, they rely on graph densification, which brings unexpected noise. To solve these problems, we propose a unified framework for COmmonSense knowledge graph completion based on BERT, namely COS-BERT. First, we transform each triplet into a natural sentence. Then, we fine-tune the pretrained language model using the transformed sentences. Finally, we rank the candidates based on the output representations of the sentences. Furthermore, we add a pre-filter to obtain a subset of candidates at the inference stage to save unnecessary computation costs. Comprehensive experiments have demonstrated the superiority of COS-BERT over the state of the art. © 2022 IEEE.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3210 - Hu 2024
Automating Knowledge Discovery from Scientific Literature via LLMs: A Dual-Agent Approach with Progressive Ontology Prompting

Hu, Yuting; Liu, Dancheng; Wang, Qingyun; Yu, Charles; Ji, Heng; Xiong, Jinjun

arXiv 2024;():

2024

Ref ID: 8569

To address the challenge of automating knowledge discovery from a vast volume of literature, in this paper, we introduce a novel framework based on large language models (LLMs) that combines a progressive ontology prompting (POP) algorithm with a dual-agent system, named LLM-Duo, designed to enhance the automation of knowledge extraction from scientific articles. The POP algorithm utilizes a prioritized breadth-first search (BFS) across a predefined ontology to generate structured prompt templates and action orders, thereby guiding LLMs to discover knowledge in an automatic manner. Additionally, our LLM-Duo employs two specialized LLM agents: an explorer and an evaluator. These two agents work collaboratively and adversarially to enhance the reliability of the discovery and annotation processes. Experiments demonstrate that our method outperforms advanced baselines, enabling more accurate and complete annotations. To validate the effectiveness of our method in real-world scenarios, we employ our method in a case study of speech-language intervention discovery. Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain. We curate these findings into a publicly accessible intervention knowledge base that holds significant potential to benefit the speech-language therapy community.
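A minimal sketch of the prioritized ontology traversal the abstract outlines: walk a toy ontology from its root, expanding lower-priority-value concepts first, and emit one prompt template per concept. The ontology, the priority values, and the prompt wording are all invented for illustration; the paper's POP algorithm and templates may differ.

```python
import heapq

def pop_prompts(ontology, priority, root):
    """Yield one prompt per concept reachable from root, lowest priority value first."""
    heap = [(priority[root], root)]
    seen = {root}
    prompts = []
    while heap:
        _, concept = heapq.heappop(heap)
        prompts.append(f"List all instances of '{concept}' mentioned in the article.")
        for child in ontology.get(concept, []):
            if child not in seen:
                seen.add(child)
                heapq.heappush(heap, (priority[child], child))
    return prompts

# Toy ontology for a speech-language-therapy-like domain (hypothetical).
ontology = {"intervention": ["technique", "outcome"], "technique": []}
priority = {"intervention": 0, "technique": 1, "outcome": 2}
prompts = pop_prompts(ontology, priority, "intervention")
```

The resulting prompt list would then be issued to the explorer agent in order, with the evaluator agent checking each annotation.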

Kwesi voted
brandon voted
Final decision
What was the agreed final decision?

#694 - Hu 2023
A question answering system for assembly process of wind turbines based on multi-modal knowledge graph and large language model

Hu, Z. Q.; Li, X. Y.; Pan, X. Y.; Wen, S. J.; Bao, J. S.

J. Eng. Des. 2023;():25

2023

DOI: 10.1080/09544828.2023.2272555 · Ref ID: 3082

In the field of wind power generation, wind turbines serve as the foundation for harnessing electrical energy. However, the assembly process information for wind turbines is typically dispersed among various modalities such as 3D models, natural text, and images in the form of process documents. The difficulty in effectively utilising historical process knowledge hampers the efficiency of assembly process design and subsequently affects production efficiency. To address this issue, this paper constructs a Multi-modal Process Knowledge Graph for Wind Turbines, named MPKG-WT. Additionally, a wind turbine assembly process question-answering system combining multi-modal knowledge graphs with large language models (LLMs) is proposed to enable efficient utilisation of historical assembly process knowledge. The proposed approach achieves outstanding results when compared with other state-of-the-art KBQA methods and recent LLMs using a wind turbine assembly process dataset. The effectiveness of the approach is further validated through a visualised assembly process question-answering system. The research findings demonstrate a significant improvement in assembly process design efficiency.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1226 - Hu 2024
EEE-QA: Exploring Effective and Efficient Question-Answer Representations

Hu, Z.; Yang, Y.; Xu, J.; Qiu, Y.; Chen, P.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():5520-5525

European Language Resources Association (ELRA) 2024

Ref ID: 4546

Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa. This work challenges the existing question-answer encoding convention and explores finer representations. We begin by testing various pooling methods against using the begin-of-sentence token as the question representation, for better quality. Next, we explore opportunities to simultaneously embed all answer candidates with the question. This enables cross-reference between answer choices and improves inference throughput via reduced memory usage. Despite their simplicity and effectiveness, these methods have yet to be widely studied in current frameworks. We experiment with different PLMs, with and without the integration of knowledge graphs. Results demonstrate the memory efficiency of the proposed techniques with little sacrifice in performance. Practically, our work improves throughput by 38-100% with 26-65% speedups on consumer-grade GPUs by allowing for considerably larger batch sizes. Our work sends a message to the community with promising directions in both representation quality and efficiency for the question-answering task in natural language processing. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
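The contrast the abstract draws, taking the begin-of-sentence token as the sequence vector versus pooling over all tokens, can be sketched in a few lines. Mean pooling is assumed here as one representative pooling variant; the paper compares several, and the exact set is not stated in the abstract.

```python
import numpy as np

def bos_representation(token_embs):
    """Conventional choice: use the first (begin-of-sentence) token's embedding."""
    return token_embs[0]

def mean_pool(token_embs, mask):
    """Average the embeddings of non-padding tokens (mask = 1 for real tokens)."""
    mask = np.asarray(mask, dtype=float)[:, None]
    return (token_embs * mask).sum(axis=0) / mask.sum()

# Toy token embeddings: two real tokens plus one padding row.
embs = np.array([[1.0, 0.0], [3.0, 2.0], [0.0, 0.0]])
rep_bos = bos_representation(embs)
rep_mean = mean_pool(embs, [1, 1, 0])
```

Embedding all answer candidates alongside the question, the paper's second idea, amounts to encoding one concatenated sequence instead of one sequence per candidate, which is where the memory and throughput savings come from.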

Srividya voted
Mike voted
Final decision
What was the agreed final decision?

#1559 - Huang 2024
Large Foundation Models for Power Systems

Huang, C.; Li, S.; Liu, R.; Wang, H.; Chen, Y.

IEEE Power and Energy Society General Meeting 2024;():

IEEE Computer Society 2024

DOI: 10.1109/PESGM51994.2024.10688670 · Ref ID: 4124

Foundation models, such as Large Language Models (LLMs), can respond to a wide range of format-free queries without any task-specific data collection or model training, creating various research and application opportunities for the modeling and operation of large-scale power systems. In this paper, we outline how such large foundation models, GPT-4 among them, are developed, and discuss how they can be leveraged in challenging power and energy system tasks. We first investigate the potential of existing foundation models by validating their performance on four representative tasks across power system domains: optimal power flow (OPF), electric vehicle (EV) scheduling, knowledge retrieval for power engineering technical reports, and situation awareness. Our results indicate strong capabilities of such foundation models in boosting the efficiency and reliability of power system operational pipelines. We also provide suggestions and projections on the future deployment of foundation models in power system applications. © 2024 IEEE.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#482 - Huang 2023
KOSA: KO Enhanced Salary Analytics based on Knowledge Graph and LLM Capabilities

Huang, F.; Deng, Y.; Zhang, C.; Guo, M. H.; Zhan, K.; Sun, S. X.; Jiang, J. L.; Sun, Z. Y.; Wu, X. D.

23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():499-505

Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023

DOI: 10.1109/icdmw60847.2023.00071 · Ref ID: 2973

Knowledge base question answering (KBQA) is designed to respond to natural language inquiries by utilizing factual information, such as entities, relationships, and attributes, derived from a knowledge base (KB). The advent of large language models (LLMs) has significantly boosted the performance of KBQA, owing to their exceptional capabilities in content comprehension and generation. In this paper, we present a Knowledge Ocean enhanced Salary Analytics (KOSA) system based on knowledge graphs and LLMs tailored to employee salary data from a public university. This system encompasses an interactive conversational interface, visualization of knowledge graphs, and advanced data analysis. By employing the framework of knowledge engineering, we enable knowledge graph modeling, Cypher (the query engine of Neo4j) reasoning, and question answering functionalities. Furthermore, machine learning algorithms are integrated to facilitate advanced features, such as salary prediction and allocation.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3980 - Huang 2024
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark

Huang, Han; Zhong, Haitian; Yu, Tao; Liu, Qiang; Wu, Shu; Wang, Liang; Tan, Tieniu

arXiv 2024;():

2024

Ref ID: 8177

Recently, knowledge editing on large language models (LLMs) has received considerable attention. Compared to this, editing Large Vision-Language Models (LVLMs) faces extra challenges from diverse data modalities and complicated model components, and data for LVLM editing are limited. The existing LVLM editing benchmark, which comprises three metrics (Reliability, Locality, and Generality), falls short in the quality of synthesized evaluation images and cannot assess whether models apply edited knowledge in relevant content. Therefore, we employ more reliable data collection methods to construct a new Large Vision-Language Model Knowledge Editing Benchmark, VLKEB, and extend the Portability metric for more comprehensive evaluation. Leveraging a multi-modal knowledge graph, our image data are bound with knowledge entities. This can be further used to extract entity-related knowledge, which constitutes the base of the editing data. We conduct experiments with different editing methods on five LVLMs and thoroughly analyze how they impact the models. The results reveal the strengths and deficiencies of these methods and hopefully provide insights for future research. The codes and dataset are available at: https://github.com/VLKEB/VLKEB.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#785 - Huang 2024
SSNF: Optimizing Entity Alignment with a Novel Structural and Semantic Neighbor Filtering

Huang, J. B.

17th International Conference on Knowledge Science, Engineering and Management (KSEM) 2024;14885():180-191

Birmingham, ENGLAND Springer-Verlag Singapore Pte Ltd 2024

DOI: 10.1007/978-981-97-5495-3_13 · Ref ID: 3393

In the domain of Knowledge Graphs (KGs), the alignment of entities is pivotal, aiming to identify and match equivalent entities across distinct KGs. Existing methodologies primarily aggregate information from direct neighbors via graph neural networks, a process which can inadvertently introduce noise. To address this challenge, we introduce SSNF, an innovative neighbor filtering mechanism that optimally balances structural and semantic information, crucial for accurate entity alignment. It employs motifs for structural assessment and leverages Large Language Models (LLMs) for semantic analysis, using a 'Reasoning-Challenging (Re-Cha)' strategy to query LLMs and determine important neighbors. This dual-focus strategy mitigates the inclusion of less informative neighbors. When integrated with existing Entity Alignment (EA) frameworks, our approach demonstrates superior efficacy, significantly outperforming conventional methods through meticulous neighbor selection. The extensive experiments, conducted on the most widely used benchmark datasets (i.e., DBP15K), exhibit a significant improvement in EA performance, demonstrating its potential to advance the field of KG entity alignment by synergizing structural insights and semantic precision.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1827 - Huang 2020
Review of Deep Learning-Based Topic Model

Huang, J. J.; Li, P. W.; Peng, M.; Xie, Q. Q.; Xu, C.

Jisuanji Xuebao 2020;43(5):827-855

2020

DOI: 10.11897/SP.J.1016.2020.00827 · Ref ID: 5703

As a research hotspot for more than twenty years, the topic model plays an important role in the semantic analysis of multiple documents. The topic model is adept at extracting groups of keywords from documents to represent their core ideas, and thus provides crucial support for document classification, information retrieval, automatic multi-document summarization, sentiment analysis, and so on. The conventional topic model based on a three-layer Bayesian network has been well studied in the past ten years. However, combining it with deep learning techniques has given the topic model a new lease of life in recent years, owing to the wide application of deep learning in natural language processing, such as word embedding training, text generation, and knowledge graph building. In deep-learning-based topic models, designing a more accurate and effective model by introducing advanced ideas and techniques from deep learning, such as word embeddings, neural networks (e.g., the recurrent neural network, RNN), the variational auto-encoder (VAE), and knowledge graphs, has become a major task. In this review, we first comparatively discuss four probabilistic topic models and two sparse additive topic models in terms of model assumptions, the document generation process, and parameter inference: latent Dirichlet allocation (LDA), the Dirichlet multinomial mixture model (DMM), the biterm topic model (BTM), sparse topical coding (STC), and the sparse additive generative model (SAGM), respectively. The above models are typical representatives of the conventional topic model and have seen various improved versions and applications since they were proposed. Then, we introduce the latest research progress on deep-learning-based topic models in detail, which can be summed up as three different types of models.
The first type is the word-embedding-based probabilistic topic model, which improves one of the conventional topic models (e.g., LDA, DMM, or BTM) with auxiliary pre-trained word embeddings while still complying with the basic assumptions of the original model. In these models, word embeddings pre-trained on a large corpus such as Wikipedia are introduced to evaluate the similarity between word pairs. Based on this evaluation, similar words are more likely to be assigned to the same topic during the topic sampling process, and thus topic coherence and text classification accuracy are eventually improved. The second type is the neural-network-based topic model, which employs a neural network structure, such as a Multilayer Perceptron (MLP) or RNN, to model the document generation process while introducing a latent topic structure. In these models, the bag-of-words of a text is fed into the neural topic model and transformed into embeddings, and the topic distribution and topic-word distribution are then inferred by the neural network. To further improve the performance of the neural topic model, a VAE is employed to transfer the text embeddings into a latent space before the topic inference process, and a sparsity constraint on the topic-word distribution is enforced to generate more expressive topical words. The third type is the joint topic and language model, which trains a topic model and a language model simultaneously. In these models, the token sequence of a text is fed into a neural network to generate text under the guidance of latent topics. Furthermore, we summarize the public datasets (e.g., 20NewsGroups) and evaluation metrics (e.g., Pointwise Mutual Information) used in the above topic models. Finally, we end by discussing some potential trends in the topic model's future development. © 2020, Science Press. All rights reserved.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#882 - Huang 2023
WERECE: An Unsupervised Method for Educational Concept Extraction Based on Word Embedding Refinement

Huang, J. X.; Ding, R. F.; Wu, X. M.; Chen, S. M.; Zhang, J. L.; Liu, L. X.; Zheng, Y. X.

Appl. Sci.-Basel 2023;13(22):20

2023

DOI: 10.3390/app132212307 · Ref ID: 3605

The era of educational big data has sparked growing interest in extracting and organizing educational concepts from massive amounts of information. The outcomes are of the utmost importance for artificial intelligence-empowered teaching and learning. Unsupervised educational concept extraction methods based on pre-trained models continue to proliferate due to ongoing advances in semantic representation. However, it remains challenging to directly apply pre-trained large language models to extract educational concepts; pre-trained models are built on extensive corpora and do not necessarily cover all subject-specific concepts. To address this gap, we propose a novel unsupervised method for educational concept extraction based on word embedding refinement (i.e., word embedding refinement-based educational concept extraction (WERECE)). It integrates a manifold learning algorithm to adapt a pre-trained model for extracting educational concepts while accounting for the geometric information in semantic computation. We further devise a discriminant function based on semantic clustering and Box-Cox transformation to enhance WERECE's accuracy and reliability. We evaluate its performance on two newly constructed datasets, EDU-DT and EDUTECH-DT. Experimental results show that WERECE achieves an average precision of up to 85.9%, recall of up to 87.0%, and an F1 score of up to 86.4%, significantly outperforming the baselines (TextRank, term frequency-inverse document frequency, isolation forest, K-means, and one-class support vector machine) on educational concept extraction. Notably, when WERECE is implemented with different parameter settings, its precision and recall remain robust. WERECE also holds broad application prospects as a foundational technology, such as for building discipline-oriented knowledge graphs, enhancing learning assessment and feedback, predicting learning interests, and recommending learning resources.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3850 - Huang 2024
RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs

Huang, Jiatan; Li, Mingchen; Yao, Zonghai; Yang, Zhichao; Xiao, Yongkang; Ouyang, Feiyun; Li, Xiaohan; Han, Shuo; Yu, Hong

arXiv 2024;():

2024

Ref ID: 8725

Answering complex real-world questions often requires accurate retrieval from textual knowledge graphs (TKGs). The scarcity of annotated data, along with intricate topological structures, makes this task particularly challenging. As relational path information can enhance the inference ability of Large Language Models (LLMs), efficiently retrieving more complex relational path information from TKGs presents another key challenge. To tackle these challenges, we first develop a Dataset for LLM Complex Reasoning over Textual Knowledge Graphs (RiTeK) with broad topological structure coverage. We synthesize realistic user queries that integrate diverse topological structures, relational information, and complex textual descriptions, and we conduct rigorous expert evaluation to validate the quality of the synthesized queries. We then introduce an enhanced Monte Carlo Tree Search (MCTS) method, Relational MCTS, to automatically extract relational path information from textual graphs for specific queries. Our dataset mainly covers the medical domain, as its relation types and entities are complex and publicly available. Experimental results indicate that RiTeK poses significant challenges for current retrieval and LLM systems, while the proposed Relational MCTS method enhances LLM inference ability and achieves state-of-the-art performance on RiTeK.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1513 - Huang 2020
Knowledge graph-augmented abstractive summarization with semantic-driven cloze reward

Huang, L.; Wu, L.; Wang, L.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2020;():5094-5107

Association for Computational Linguistics (ACL) 2020

DOI: 10.18653/v1/2020.acl-main.457 · Ref ID: 5777

Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content and are often found to be near-extractive. We argue that, to address these issues, the summarizer should acquire semantic interpretation over the input, e.g., via structured representation, to allow the generation of more informative summaries. In this paper, we present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD. We propose the use of dual encoders, a sequential document encoder and a graph-structured encoder, to maintain the global context and local characteristics of entities, complementing each other. We further design a reward based on a multiple-choice cloze test to drive the model to better capture entity interactions. Results show that our models produce significantly higher ROUGE scores than a variant without the knowledge graph as input on both the New York Times and CNN/Daily Mail datasets. We also obtain better or comparable performance compared to systems that are fine-tuned from large pretrained language models. Human judges further rate our model outputs as more informative and containing fewer unfaithful errors. © 2020 Association for Computational Linguistics

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#671 - Huang 2023
PRODIGY: Enabling In-context Learning Over Graphs

Huang, Q.; Ren, H. Y.; Chen, P.; Krzmanc, G.; Zeng, D.; Liang, P.; Leskovec, J.

37th Conference on Neural Information Processing Systems (NeurIPS) 2023;():

New Orleans, LA Neural Information Processing Systems (Nips) 2023

Ref ID: 3504

In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks by conditioning on prompt examples, without optimizing any parameters. While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pretraining Over Diverse In-Context Graph Systems (PRODIGY), the first pretraining framework that enables in-context learning over graphs. The key idea of our framework is to formulate in-context learning over graphs with a novel prompt graph representation, which connects prompt examples and queries. We then propose a graph neural network architecture over the prompt graph and a corresponding family of in-context pretraining objectives. With PRODIGY, the pretrained model can directly perform novel downstream classification tasks on unseen graphs via in-context learning. We provide empirical evidence of the effectiveness of our framework by showcasing its strong in-context learning performance on tasks involving citation networks and knowledge graphs. Our approach outperforms the in-context learning accuracy of contrastive pretraining baselines with hard-coded adaptation by 18% on average across all setups. Moreover, it also outperforms standard finetuning with limited data by 33% on average with in-context learning.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1998 - Huang 2025
Two Semantic Information Extension Enhancement Methods For Zero-Shot Learning

Huang, W.; Ju, X.; Zhou, Y.; Xu, Y.; Yang, G.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2025;15035 LNCS():511-525

Springer Science and Business Media Deutschland GmbH 2025

DOI: 10.1007/978-981-97-8620-6_35 · Ref ID: 3804

In the domain of computer vision, Zero-Shot Learning (ZSL) achieves the classification of unseen class objects by utilizing semantic information about class relationships. Acquiring richer semantic information and representations is a significant avenue for enhancing learner performance. Existing studies of ZSL predominantly address this challenge only by introducing knowledge graphs and graph neural networks, overlooking inadequacies in the original semantic information as well as the intrinsic hierarchical and directional characteristics of the graph structure. This paper proposes two semantic information enhancement methods for ZSL, tailored respectively for regular datasets and large-scale datasets. For regular ZSL datasets, our method leverages the textual knowledge within large language models, extending traditional 2-dimensional attribute annotations to a 3-dimensional space to obtain more comprehensive class-level semantic information. For large-scale ZSL tasks, our approach combines the enhanced semantic information with external knowledge graphs to simulate class relationships, employing the intrinsic structure and directionality of graphs to bolster semantic representations. We validated our approaches on four traditional ZSL datasets and the ImageNet dataset. The experimental results show significant improvements in ZSL performance, underscoring the potential of our methods. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1177 - Huang 2024
Designing an Interpretable Question Answering System for Vertical Domains Based on Large Language Model and Knowledge Graph

Huang, X.; Ku, C. S.

Advances in Transdisciplinary Engineering 2024;57():552-561

IOS Press BV 2024

DOI: 10.3233/ATDE240503 · Ref ID: 3988

Given the low interpretability of large language models (LLMs) due to their extensive parameters and intricate features, this study aims to enhance the understandability and interpretability of automatic QA systems powered by LLMs, thereby addressing a critical gap in the field. To achieve this, we introduce an interpretable architecture for a domain-specific LLM-based question-answering (QA) system. The research decomposes the QA system into six modules: operation recognition, intent recognition, normalization, triplet structured data conversion, knowledge graph querying, and query result processing. Through this approach, the input and output of each module in the QA system are human-readable text data, enhancing the interpretability of the QA system's processing. The use of knowledge graph data increases the credibility of the answers provided by the QA system. The QA system architecture proposed in this study attempts to integrate the powerful natural language understanding capabilities of large language models with the data querying capacity of knowledge graphs, offering a reference for addressing the issue of low interpretability in automatic QA systems based on large language models (LLMs). © 2024 The Authors.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3985 - Huang 2024
WESE: Weak Exploration to Strong Exploitation for LLM Agents

Huang, Xu; Liu, Weiwen; Chen, Xiaolong; Wang, Xingmei; Lian, Defu; Wang, Yasheng; Tang, Ruiming; Chen, Enhong

arXiv 2024;():

2024

Ref ID: 8226

Recently, large language models (LLMs) have demonstrated remarkable potential as intelligent agents. However, existing research mainly focuses on enhancing agents' reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the process of exploration and exploitation. When addressing complex tasks within open-world interactive environments, these methods exhibit limitations. Firstly, the lack of global information about the environment leads to greedy decisions and thus sub-optimal solutions. On the other hand, irrelevant information acquired from the environment not only introduces noise but also incurs additional cost. This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks. Concretely, WESE decouples the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration and gather global knowledge. A knowledge-graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, improving the stronger agent's success rate and efficiency on the exploitation task. Our approach is flexible enough to incorporate diverse tasks and obtains significant improvements in both success rate and efficiency across four interactive benchmarks.
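The weak-explore / strong-exploit split the abstract describes can be sketched as a shared fact store: a cheap explorer records (entity, relation, entity) facts, and the strong agent retrieves only the facts touching the entities named in its task before acting. All names here are hypothetical; the abstract does not specify WESE's graph schema or retrieval strategy.

```python
from collections import defaultdict

class KnowledgeStore:
    """Toy knowledge-graph store filled by a weak explorer agent."""

    def __init__(self):
        self.facts = defaultdict(list)  # entity -> triples mentioning it

    def record(self, head, relation, tail):
        triple = (head, relation, tail)
        self.facts[head].append(triple)
        self.facts[tail].append(triple)

    def task_relevant(self, task_entities):
        """Return deduplicated facts mentioning any task entity,
        i.e. the slice of global knowledge handed to the strong agent."""
        out = []
        for entity in task_entities:
            for triple in self.facts.get(entity, []):
                if triple not in out:
                    out.append(triple)
        return out

store = KnowledgeStore()
store.record("key", "opens", "door")          # found during weak exploration
store.record("apple", "lies_on", "table")
relevant = store.task_relevant(["door"])      # strong agent's task mentions "door"
```

Filtering to task-relevant facts is what keeps the noise and cost of full exploration away from the stronger, more expensive agent.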

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#2852 - Huang 2010
Research on Representation of Geographic Feature Based on Geo-Ontology

Huang, Y.; Deng, G.; Wu, X.; Zhao, Z.

2010 2nd International Workshop on Intelligent Systems and Applications 2010;():1-5

2010

DOI: 10.1109/IWISA.2010.5473529 · Ref ID: 6310

In the future, GIS will develop in the direction of widespread application, and sharing geographic information has become a pressing problem to resolve. Because people's cognition of the real geographic world differs from person to person, semantic discrepancies arise. As a result, realizing geographic information sharing first requires realizing semantic sharing in GIS. Geo-ontology provides generally accepted concepts in the geographic information domain and explicit formalized definitions thereof, so as to resolve the problems that differing geographic cognitions give rise to, as well as the inter-translation problem of their description logics. Geo-ontology can be applied to the integration and sharing of geographic information. Starting with the process of abstraction and generalization from the real geographic world to the computer world, this paper analyzes the sources of semantic heterogeneity and introduces geographic features and the generalized geographic objects that represent them. It then analyzes the relationships between geo-ontology and geographic features in detail, and finally emphasizes research on how to use geo-ontology to represent geographic features.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#596 - Huang 2023
MVP-Tuning: Multi-View Knowledge Retrieval with Prompt Tuning for Commonsense Reasoning

Huang, Y. F.; Li, Y. Y.; Xu, Y. C.; Zhang, L.; Gan, R. Y.; Zhang, J. X.; Wang, L. W.

61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():13417-13432

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3753

Recent advances in pre-trained language models (PLMs) have facilitated the development of commonsense reasoning tasks. However, existing methods rely on multi-hop knowledge retrieval and thus suffer from low accuracy due to noise embedded in the acquired knowledge. In addition, these methods often incur high computational costs and nontrivial knowledge loss because they encode the knowledge independently of the PLM, making it less relevant to the task and resulting in a poor local optimum. In this work, we propose Multi-View Knowledge Retrieval with Prompt Tuning (MVP-Tuning). MVP-Tuning leverages similar question-answer pairs in the training set to improve knowledge retrieval and employs a single prompt-tuned PLM to model knowledge and input text jointly. We conduct experiments on five commonsense reasoning QA benchmarks and show that MVP-Tuning outperforms all other baselines on 4 out of 5 datasets with at most 2% trainable parameters. The ensemble of our MVP-Tuning models even achieves a new state-of-the-art performance on OpenBookQA and is ranked first on the leaderboard(1). Our code and data are available(2).

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3805 - Hussien 2024
RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models

Hussien, Mohamed Manzour; Melo, Angie Nataly; Ballardini, Augusto Luis; Maldonado, Carlota Salinas; Izquierdo, Rubén; Sotelo, Miguel Ángel

arXiv 2024;():

2024

Ref ID: 8269

Prediction of road users' behaviors in the context of autonomous driving has gained considerable attention from the scientific community in recent years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of reality, since road users are humans and as such are highly influenced by their surrounding context. In addition, a plethora of research works relies on powerful Deep Learning techniques, which exhibit high performance metrics in prediction tasks but may lack the ability to fully understand and exploit the contextual semantic information contained in the road scene, not to mention their inability to provide explainable predictions that can be understood by humans. In this work, we propose an explainable road users' behavior prediction system that integrates the reasoning abilities of Knowledge Graphs (KGs) and the expressive capabilities of Large Language Models (LLMs) using Retrieval Augmented Generation (RAG) techniques. For that purpose, Knowledge Graph Embeddings (KGEs) and Bayesian inference are combined to deploy a fully inductive reasoning system that enables predictions relying on legacy information contained in the graph as well as on current evidence gathered in real time by onboard sensors. Two use cases have been implemented following the proposed approach: 1) prediction of pedestrians' crossing actions; 2) prediction of lane-change maneuvers. In both cases, the performance attained surpasses the current state of the art in terms of anticipation and F1-score, showing a promising avenue for future research in this field.

yuexi voted
mohammed afaan voted

#1101 - Hwang 2021
(COMET-)ATOMIC2020: On Symbolic and Neural Commonsense Knowledge Graphs

Hwang, J. D.; Bhagavatula, C.; Le Bras, R.; Da, J.; Sakaguchi, K.; Bosselut, A.; Choi, Y.

35th AAAI Conference on Artificial Intelligence, AAAI 2021 2021;7():6384-6392

Association for the Advancement of Artificial Intelligence 2021

Ref ID: 5655

Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs (CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about the quality and coverage of these resources due to the massive scale required to comprehensively encompass general commonsense knowledge. In this work, we posit that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents. Therefore, we propose a new evaluation framework for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them. With this new goal, we propose ATOMIC2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models. We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains ∼12 absolute points lower than a BART-based knowledge model trained on ATOMIC2020 despite using over 430x fewer parameters.

Mike voted
Srividya voted

#3945 - Ibáñez 2023
Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination

Ibáñez, Luis-Daniel; Domingue, John; Kirrane, Sabrina; Seneviratne, Oshani; Third, Aisling; Vidal, Maria-Esther

arXiv 2023;():

2023

Ref ID: 7913

Knowledge Graphs (KGs) have emerged as fundamental platforms for powering intelligent decision-making and a wide range of Artificial Intelligence (AI) services across major corporations such as Google, Walmart, and AirBnb. KGs complement Machine Learning (ML) algorithms by providing data context and semantics, thereby enabling further inference and question-answering capabilities. The integration of KGs with neuronal learning (e.g., Large Language Models (LLMs)) is currently a topic of active research, commonly named neuro-symbolic AI. Despite the numerous benefits that can be accomplished with KG-based AI, its growing ubiquity within online services may result in the loss of self-determination for citizens as a fundamental societal issue. The more we rely on these technologies, which are often centralised, the less citizens will be able to determine their own destinies. To counter this threat, AI regulation, such as the European Union (EU) AI Act, is being proposed in certain regions. The regulation sets what technologists need to do, leading to questions concerning: How can the output of AI systems be trusted? What is needed to ensure that the data fuelling and the inner workings of these artefacts are transparent? How can AI be made accountable for its decision-making? This paper conceptualises the foundational topics and research pillars to support KG-based AI for self-determination. Drawing upon this conceptual framework, challenges and opportunities for citizen self-determination are illustrated and analysed in a real-world scenario. As a result, we propose a research agenda aimed at accomplishing the recommended objectives.

Davis voted
mohammed afaan voted

#3218 - Ifergan 2024
Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs

Ifergan, Maxim; Choshen, Leshem; Aharoni, Roee; Szpektor, Idan; Abend, Omri

arXiv 2024;():

2024

Ref ID: 8548

The veracity of a factoid is largely independent of the language it is written in. However, language models are inconsistent in their ability to answer the same factual question across languages. This raises questions about how LLMs represent a given fact across languages. We explore multilingual factual knowledge through two aspects: the model's ability to answer a query consistently across languages, and the ability to "store" answers in a shared representation for several languages. We propose a methodology to measure the extent of representation sharing across languages by repurposing knowledge editing methods. We examine LLMs with various multilingual configurations using a new multilingual dataset. We reveal that high consistency does not necessarily imply shared representation, particularly for languages with different scripts. Moreover, we find that script similarity is a dominant factor in representation sharing. Finally, we observe that if LLMs could fully share knowledge across languages, their accuracy in their best-performing language could benefit from an increase of up to 150% on average. These findings highlight the need for improved multilingual knowledge representation in LLMs and suggest a path for the development of more robust and consistent multilingual LLMs.
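
The consistency notion above can be made concrete. A minimal sketch, noting that the pairwise-agreement form below is my assumption (the paper's exact metric is not given in the abstract) and the answers are invented:

```python
# Toy cross-lingual consistency score: the fraction of language pairs that
# give the same answer to one factual question. Purely illustrative.
from itertools import combinations

def pairwise_consistency(answers):
    """answers: mapping language -> model answer for a single factual query."""
    pairs = list(combinations(answers.values(), 2))
    if not pairs:
        return 1.0  # a single language is trivially consistent with itself
    return sum(a == b for a, b in pairs) / len(pairs)

answers = {"en": "Paris", "fr": "Paris", "de": "Paris", "hi": "Lyon"}
score = pairwise_consistency(answers)  # 3 of 6 pairs agree -> 0.5
```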

yuexi voted
Mike voted

#1423 - Iga 2024
Integrating LLMs with Knowledge Graphs-enhanced Task-Oriented Dialogue Systems

Iga, V. I. R.

CEUR Workshop Proceedings 2024;3767():40-51

CEUR-WS 2024

Ref ID: 4117

Large Language Models (LLMs) have become the state-of-the-art natural language processing systems. Their emergent abilities paved the way for dialogue systems capable of understanding and solving users' specific tasks, ranging from arithmetic problems to simple chatting, all expressed in natural language. However, for specific domains, research has shown that LLMs cannot directly substitute Task-Oriented Dialogue (TOD) Systems. TOD systems aim to master a specific domain or company, enabling communication in natural language. Thus, this research project focuses on building personalized TOD systems with the help of artificial intelligence, using LLMs grounded with Temporal Knowledge Graphs. We assess the temporal validity of facts in the KG through temporal timestamps. To capture the dynamics of a company or domain, business processes are modeled with BPMN, offering the possibility of converting them to KGs. Finally, the TOD system will be able to grow a domain-specific KG and reason over it, leveraging LLM capabilities for solving KG-related tasks.

mohammed afaan voted
yuexi voted

#539 - Iga 2024
LLMs for Knowledge-Graphs Enhanced Task-Oriented Dialogue Systems: Challenges and Opportunities

Iga, V. I. R.; Silaghi, G. C.

36th International Conference on Advanced Information Systems Engineering (CAiSE) 2024;521():168-179

Limassol, CYPRUS Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-61003-5_15 · Ref ID: 2981

Large Language Models are a great tool for solving diverse tasks formulated in natural language. Recent work has demonstrated their capacity for solving tasks related to Knowledge Graphs, such as Knowledge Graph Completion or Knowledge Graph Reasoning, even in Zero- or Few-Shot paradigms. However, given a particular input, they do not always produce the same output, and sometimes point to intermediate reasoning steps that are not valid, even if they produce a satisfactory answer. Moreover, the use of LLMs is mostly studied for static knowledge graphs, while temporal ones are overlooked. To highlight opportunities and challenges in knowledge graph related tasks, we experiment with ChatGPT on graph completion and reasoning for both static and temporal facets, using three different prompting techniques in Zero- and One-Shot contexts, on a Task-Oriented Dialogue system use case. Our results show that ChatGPT can solve given tasks, but mostly in a nondeterministic way.

Xinchen voted
mohammed afaan voted

#151 - Ilievski 2021
CSKG: The CommonSense Knowledge Graph

Ilievski, F.; Szekely, P.; Zhang, B.

18th Extended Semantic Web Conference (ESWC) 2021;12731():680-696

Electr Network Springer International Publishing Ag 2021

DOI: 10.1007/978-3-030-77385-4_41 · Ref ID: 2994

Sources of commonsense knowledge support applications in natural language understanding, computer vision, and knowledge graphs. Given their complementarity, their integration is desired. Yet, their different foci, modeling approaches, and sparse overlap make integration difficult. In this paper, we consolidate commonsense knowledge by following five principles, which we apply to combine seven key sources into a first integrated CommonSense Knowledge Graph (CSKG). We analyze CSKG and its various text and graph embeddings, showing that CSKG is well-connected and that its embeddings provide a useful entry point to the graph. We demonstrate how CSKG can provide evidence for generalizable downstream reasoning and for pre-training of language models. CSKG and all its embeddings are made publicly available to support further research on commonsense knowledge integration and reasoning.

Xinchen voted
Srividya voted

#1595 - Incitti 2024
Leveraging LLMs for Knowledge Engineering from Technical Manuals: A Case Study in the Medical Prosthesis Manufacturing Domain

Incitti, F.; Salfinger, A.; Snidaro, L.; Challapalli, S.

FUSION 2024 - 27th International Conference on Information Fusion 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.23919/FUSION59988.2024.10706469 · Ref ID: 4198

Ontologies are nowadays widely used to organize information across specific domains, being effective due to their hierarchical structure and the ability to explicitly represent relationships between concepts. Knowledge engineering, such as compiling a company's vast body of knowledge into these structures, however, still represents a time-consuming, largely manual process, especially since significant amounts of knowledge are often recorded only within unstructured text documents. Since the recently introduced Large Language Models (LLMs) excel at text summarization, this raises the question of whether they could be exploited within dedicated knowledge fusion architectures to assist human knowledge engineers by automatically suggesting relevant classes, instances, and relations extracted from textual corpora. We therefore propose a novel approach that leverages the taxonomic structure of a partially defined ontology to prompt LLMs for hierarchical knowledge organization. Unlike conventional methods that rely solely on static ontologies, our methodology dynamically generates prompts based on the ontology's existing class taxonomy, prompting the LLM to generate responses that extract supplementary information from unstructured documents. It thus introduces the concept of using ontologies as scaffolds for guiding LLMs, in order to realize a mutual interplay between structured ontological knowledge and the soft fusion capabilities of LLMs. We evaluate our proposed algorithm on a real-world case study, performing a knowledge fusion task on heterogeneous technical documentation from a medical prosthesis manufacturer.
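
A minimal sketch of the taxonomy-scaffolded prompting idea above; the taxonomy, prompt wording, and `build_prompt` helper are all hypothetical, not the paper's implementation:

```python
# Build an extraction prompt whose scaffold is the ontology's class taxonomy.
taxonomy = {
    "Prosthesis": ["HipImplant", "KneeImplant"],  # class -> subclasses
    "HipImplant": [],
    "KneeImplant": [],
}

def build_prompt(root, document_snippet):
    """Embed the known class taxonomy in the prompt to guide extraction."""
    classes = ", ".join(sorted(taxonomy))
    return (
        f"Given the ontology classes [{classes}] rooted at '{root}', "
        f"suggest instances and sub-classes found in this text:\n"
        f"{document_snippet}"
    )

prompt = build_prompt("Prosthesis", "The XR-5 is a titanium hip implant.")
```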

Kwesi voted
Xinchen voted

#526 - Islakoglu 2024
Leveraging Pre-trained Language Models for Time Interval Prediction in Text-Enhanced Temporal Knowledge Graphs

Islakoglu, D. S.; Chekol, M. W.; Velegrakis, Y.

21st International Conference on The Semantic Web (ESWC) 2024;14664():59-78

Hersonissos, GREECE Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-60626-7_4 · Ref ID: 3119

Most knowledge graph completion (KGC) methods rely solely on structural information, even though a large number of publicly available KGs contain additional temporal (validity time intervals) and textual data (entity descriptions). While recent temporal KGC methods utilize time information to enhance link prediction, they do not leverage textual descriptions or support inductive inference (prediction for entities that have not been seen during training). In this work, we propose a novel framework called TEMT that exploits the power of pre-trained language models (PLMs) for temporal KGC. TEMT predicts time intervals of facts by fusing their textual and temporal information. It also supports inductive inference by utilizing PLMs. In order to showcase the power of TEMT, we carry out several experiments including time interval prediction, both in transductive and inductive settings, and triple classification. The experimental results demonstrate that TEMT is competitive with the state-of-the-art, while also supporting inductiveness.

Srividya voted
Xinchen voted

#3675 - Israelsen 2023
LLMs for Multi-Modal Knowledge Extraction and Analysis in Intelligence/Safety-Critical Applications

Israelsen, Brett; Sarkar, Soumalya

arXiv 2023;():

2023

Ref ID: 7973

Large Language Models have seen rapid progress in capability in recent years; this progress has been accelerating and their capabilities, measured by various benchmarks, are beginning to approach those of humans. There is a strong demand to use such models in a wide variety of applications but, due to unresolved vulnerabilities and limitations, great care needs to be used before applying them to intelligence and safety-critical applications. This paper reviews recent literature related to LLM assessment and vulnerabilities to synthesize the current research landscape and to help understand what advances are most critical to enable use of these technologies in intelligence and safety-critical applications. The vulnerabilities are broken down into ten high-level categories and overlaid onto a high-level life cycle of an LLM. Some general categories of mitigations are reviewed.

mohammed afaan voted
yuexi voted

#1730 - Izquierdo-Badiola 2024
PlanCollabNL: Leveraging Large Language Models for Adaptive Plan Generation in Human-Robot Collaboration

Izquierdo-Badiola, S.; Canal, G.; Rizzo, C.; Alenya, G.

Proceedings - IEEE International Conference on Robotics and Automation 2024;():17344-17350

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICRA57147.2024.10610055 · Ref ID: 4665

"Hey, robot. Let's tidy up the kitchen. By the way, I have back pain today". How can a robotic system devise a shared plan with an appropriate task allocation from this abstract goal and agent condition? Classical AI task planning has been explored for this purpose, but it involves a tedious definition of an inflexible planning problem. Large Language Models (LLMs) have shown promising generalisation capabilities in robotics decision-making through knowledge extraction from Natural Language (NL). However, the translation of NL information into constrained robotics domains remains a challenge. In this paper, we use LLMs as translators between NL information and a structured AI task planning problem, targeting human-robot collaborative plans. The LLM generates information that is encoded in the planning problem, including specific subgoals derived from an NL abstract goal, as well as recommendations for subgoal allocation based on NL agent conditions. The framework, PlanCollabNL, is evaluated for a number of goals and agent conditions, and the results show that correct and executable plans are found in most cases. With this framework, we intend to add flexibility and generalisation to HRC plan generation, eliminating the need for a manual and laborious definition of restricted planning problems and agent models. © 2024 IEEE.

mohammed afaan voted
yuexi voted

#3540 - Jain 2024
Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering

Jain, Parag; Lapata, Mirella

arXiv 2024;():

2024

Ref ID: 8455

We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes. Our method utilizes a graph structured representation to aggregate information about a question and its context (i.e., the conversation so far and evidence retrieved to find an answer), while also harnessing the reasoning and text generation capabilities of large language models (LLMs). Graph embeddings are directly injected into the LLM, bypassing the token embedding layers, and learned end-to-end by minimizing cross-entropy. Our model maintains a memory module to track and update past evidence, thus influencing the graph's structure, as the conversation evolves. Experimental results on the ConvMix benchmark (Christmann et al., 2022a) show that graph embeddings enhance the LLM's ability to reason, while the memory module provides robustness against noise and retrieval errors.

Xinchen voted
mohammed afaan voted

#2468 - Jamal 2015
Formalizing air traffic control system using agent-based Mobile Petri Nets

Jamal, M.; Zafar, N. A.

2015 International Conference on Information and Communication Technologies (ICICT) 2015;():1-6

2015

DOI: 10.1109/ICICT.2015.7469480 · Ref ID: 6812

Agent-based Mobile Petri Nets (MPN) are an emerging variant of classical Petri Nets that allow a graphical representation of the system to be developed. In addition, agent-based MPNs integrate mobile-agent technology for modeling concurrency and mobility. The Unified Modeling Language (UML) has become a de facto standard for modeling any real-world system. Unlike UML models, MPNs are based on mathematical semantics and can be verified for the presence of errors and inconsistencies. This paper demonstrates the strength of agent-based MPNs in modeling and verifying Air Traffic Control (ATC), a complex, highly distributed, and safety-critical system. First, an abstract model of the ATC system is introduced by identifying mobile agents such as aircraft and controllers; the abstract ATC model is then transformed into a formal ATC model. The three major operations of takeoff, enroute flight, and landing have been formalized using agent-based MPNs. Finally, reachability analysis is used to verify the formal ATC model.
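
The reachability analysis mentioned above can be illustrated on a toy net. A minimal sketch (two hypothetical transitions for a single aircraft token; real agent-based Mobile Petri Nets are far richer):

```python
# Toy Petri net: markings are dicts place -> token count; each transition
# lists the tokens it consumes and the tokens it produces.
transitions = {
    "takeoff": ({"ground": 1}, {"air": 1}),   # (consumes, produces)
    "landing": ({"air": 1}, {"ground": 1}),
}

def fire(marking, name):
    """Return the marking after firing `name`, or None if not enabled."""
    consumes, produces = transitions[name]
    if any(marking.get(p, 0) < n for p, n in consumes.items()):
        return None
    new = dict(marking)
    for p, n in consumes.items():
        new[p] -= n
    for p, n in produces.items():
        new[p] = new.get(p, 0) + n
    return new

def key(marking):
    # Canonical, hashable form of a marking (zero-count places dropped).
    return tuple(sorted((p, n) for p, n in marking.items() if n))

def reachable(start):
    """Exhaustively enumerate markings reachable from `start`."""
    seen, frontier = {key(start)}, [start]
    while frontier:
        m = frontier.pop()
        for t in transitions:
            nxt = fire(m, t)
            if nxt is not None and key(nxt) not in seen:
                seen.add(key(nxt))
                frontier.append(nxt)
    return seen

states = reachable({"ground": 1})  # both {ground} and {air} are reachable
```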

mohammed afaan voted
yuexi voted

#440 - Jamil 2024
Knowledge Graph Enhancement for Improved Natural Language Health Question Answering using Large Language Models

Jamil, H. M.; Oduro-Afriyie, J.

36th International Conference on Scientific and Statistical Database Management (SSDBM) 2024;():

Rennes Univ, Inria Centre, Rennes, FRANCE Assoc Computing Machinery 2024

DOI: 10.1145/3676288.3676289 · Ref ID: 3461

In this paper we present a method for enhancing Question Answering (QA) systems by iteratively improving Knowledge Graphs (KGs), with a focus on maintaining monotonicity in the enhancement process. We introduce a mathematical framework employing functions τ and φ, where τ transforms a text T into a KG K, and φ generates an answer from T for a given question. We propose that augmenting K with domain-specific information, denoted ΔK, leads to a more accurate approximation of the expected answer, adhering to the principle that each enhancement either maintains or improves answer quality. This concept is formalized as φ⁻¹(τ(T) ∪ ΔK) yielding better results than φ⁻¹(τ(T)). The paper elaborates on this process with practical examples, demonstrating how KG enhancements, under the constraints of monotonicity, lead to successive improvements in the QA system.
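
The monotonicity principle above can be illustrated with a toy implementation. A minimal sketch, assuming a trivial "A is B" extractor; `tau` and `answer` merely play the roles of τ and the answer function, not the paper's actual constructions:

```python
# tau: text -> toy KG (a set of triples); answer: everything the KG says
# about a subject. Augmenting the KG can only add to an answer, never
# remove from it -- the monotonicity property.
def tau(text):
    """Extract toy (subject, 'is', object) triples from 'A is B.' sentences."""
    triples = set()
    for sent in text.split("."):
        words = sent.split()
        if len(words) == 3 and words[1] == "is":
            triples.add((words[0], "is", words[2]))
    return triples

def answer(kg, subject):
    """Return every object related to `subject` in the KG."""
    return {o for (s, r, o) in kg if s == subject}

base_kg = tau("aspirin is analgesic. ibuprofen is NSAID.")
delta_k = {("aspirin", "is", "NSAID")}        # domain-specific enhancement ΔK

before = answer(base_kg, "aspirin")
after = answer(base_kg | delta_k, "aspirin")  # answer over the augmented KG

assert before <= after  # monotonicity: enhancement never loses answers
```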

Kwesi voted
Davis voted

#292 - Janatian 2023
From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems

Janatian, S.; Westermann, H.; Tan, J. Z.; Savelka, J.; Benyekhlef, K.

36th Annual International Conference on Legal Knowledge and Information Systems (JURIX) 2023;379():167-176

Maastricht Univ, Maastricht, NETHERLANDS Ios Press 2023

DOI: 10.3233/faia230962 · Ref ID: 3645

Encoding legislative text in a formal representation is an important prerequisite to different tasks in the field of AI & Law. For example, rule-based expert systems focused on legislation can support laypeople in understanding how legislation applies to them and provide them with helpful context and information. However, the process of analyzing legislation and other sources to encode it in the desired formal representation can be time-consuming and represents a bottleneck in the development of such systems. Here, we investigate to what degree large language models (LLMs), such as GPT-4, are able to automatically extract structured representations from legislation. We use LLMs to create pathways from legislation, according to the JusticeBot methodology for legal decision support systems, evaluate the pathways and compare them to manually created pathways. The results are promising, with 60% of generated pathways being rated as equivalent or better than manually created ones in a blind comparison. The approach suggests a promising path to leverage the capabilities of LLMs to ease the costly development of systems based on symbolic approaches that are transparent and explainable.

mohammed afaan voted
yuexi voted

#1809 - Ji 2024
Research on Knowledge Injection Method for Large Language Model Oriented to Process Specification Texts

Ji, G.; Wang, P.; Yu, Z.

J. Frontier. Comput. Sci. Technol. 2024;18(9):2361-2369

2024

DOI: 10.3778/j.issn.1673-9418.2406067 · Ref ID: 3806

The application of large language models in process specifications is an effective approach to addressing the issue of inaccurate process knowledge queries. At present, domain model construction methods based on domain knowledge graph embedding or fine-tuning with instruction data are not effective. The difficulty lies in the fact that the process knowledge in process specifications involves relationships between multiple process elements and is highly complex, and the data are sparse because the standards are only used through citation. The high complexity of process knowledge and sparse data limit the model's ability to learn process domain concepts, the relationships between concepts and attributes, the relationships between concepts, the relationships between multiple concepts, and reference-based knowledge. To address this difficulty, this paper proposes a large language model knowledge injection method for process specification texts. According to the characteristics of process specification data, the paper designs knowledge-injection data covering an auxiliary sentence identification task, a concept-chapter generation task, a chapter continuation task, and a chapter-summary generation task. The model is fine-tuned through supervised learning by combining question-answer pair data to inject domain concepts, attributes, relationships between multiple concepts, and reference knowledge into the model. Experimental results show that the model trained with knowledge injection data and question-answer pair data improves ACC (accuracy) by 7.3 percentage points, ROUGE-L by 7.4 percentage points, and BLEU-4 by 6.2 percentage points compared with the model trained only with question-answer pair data, indicating the effectiveness of the proposed knowledge injection method.
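
ROUGE-L, reported above, scores a candidate against a reference by their longest common subsequence (LCS). A minimal sketch, assuming whitespace tokenization and the F1 form of the measure:

```python
# ROUGE-L = F1 over LCS-based precision and recall.
def lcs_len(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference):
    """F1 of LCS-based precision (vs. candidate) and recall (vs. reference)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge_l("the cat sat on the mat", "the cat lay on the mat")  # LCS = 5
```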

Ishan voted
Kwesi voted

#1988 - Ji 2022
Transferring Knowledge from Structure-aware Self-attention Language Model to Sequence-to-Sequence Semantic Parsing

Ji, R.; Ji, J.

Proceedings - International Conference on Computational Linguistics, COLING 2022;29():3164-3174

Association for Computational Linguistics (ACL) 2022

Ref ID: 5347

Semantic parsing considers the task of mapping a natural language sentence into a target formal representation, where various sophisticated sequence-to-sequence (seq2seq) models have been applied with promising results. Generally, these target representations follow a syntax formalism that limits permitted forms. However, it is neither easy nor flexible to explicitly integrate this syntax formalism into a neural seq2seq model. In this paper, we present a structure-aware self-attention language model to capture structural information of target representations and propose a knowledge distillation based approach to incorporating the target language model into a seq2seq model, where grammar rules or sketches are not required in the training process. An ablation study shows that the proposed language model can notably improve the performance of the baseline model. The experiments show that our method achieves new state-of-the-art performance among neural approaches on four semantic parsing (ATIS, GEO) and Python code generation (Django, CoNaLa) tasks.

Mike voted
Srividya voted

#80 - Ji 2021
C-CLUE: A Benchmark of Classical Chinese Based on a Crowdsourcing System for Knowledge Graph Construction

Ji, Z. J.; Shen, Y. X.; Sun, Y. N.; Yu, T.; Wang, X.

6th China Conference on Knowledge Graph and Semantic Computing (CCKS) 2021;1466():295-301

Guangzhou, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2021

DOI: 10.1007/978-981-16-6471-7_24 · Ref ID: 3286

Knowledge Graph Construction (KGC) aims to organize and visualize knowledge, and is based on the tasks of Named Entity Recognition (NER) and Relation Extraction (RE). However, the difficulty of comprehension, caused by the differences in grammar and semantics between classical and modern Chinese, makes entity and relation annotation time-consuming and labour-intensive in classical Chinese corpora. In this paper, we design a novel crowdsourcing annotation system, which can gather collective intelligence as well as utilize domain knowledge to achieve efficient annotation and obtain fine-grained, high-quality datasets. More specifically, we assess each user's professionalism, calculated from online tests, and take it into account in annotation-result integration and reward assignment, which plays a vital role in improving annotation accuracy. Moreover, we evaluate several pre-trained language models, the state-of-the-art methods in Natural Language Processing (NLP), on the benchmark datasets obtained by the system over the NER and RE tasks. Benchmark datasets, implementation details, and evaluation processes are available at https://github.com/jizijing/C-CLUE. The access URL of the crowdsourcing annotation system is: http://152.136.45.252:60002/pages/login.html.
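
The professionalism-weighted integration of crowd annotations described above can be sketched as a weighted vote (the labels, weights, and rule below are invented for illustration; the system's actual integration formula is not given in the abstract):

```python
# Pick the label whose supporters carry the most professionalism weight.
from collections import defaultdict

def aggregate(votes):
    """votes: list of (label, professionalism_weight) pairs."""
    totals = defaultdict(float)
    for label, weight in votes:
        totals[label] += weight
    return max(totals, key=totals.get)

votes = [("PERSON", 0.9), ("LOCATION", 0.4), ("PERSON", 0.3)]
label = aggregate(votes)  # PERSON wins with total weight 1.2
```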

mohammed afaan voted
yuexi voted

#1972 - Ji 2023
Towards Mitigating Hallucination in Large Language Models via Self-Reflection

Ji, Z.; Yu, T.; Xu, Y.; Lee, N.; Ishii, E.; Fung, P.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():1827-1843

Association for Computational Linguistics (ACL) 2023

Ref ID: 5046

Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks. However, the practical deployment still faces challenges, notably the issue of "hallucination", where models generate plausible-sounding but unfaithful or nonsensical information. This issue becomes particularly critical in the medical domain due to the uncommon professional concepts and potential social risks involved. This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets. Our investigation centers on the identification and comprehension of common problematic answers, with a specific emphasis on hallucination. To tackle this challenge, we present an interactive self-reflection methodology that incorporates knowledge acquisition and answer generation. Through this feedback process, our approach steadily enhances the factuality, consistency, and entailment of the generated answers. Consequently, we harness the interactivity and multitasking ability of LLMs and produce progressively more precise and accurate answers. Experimental results on both automatic and human evaluation demonstrate the superiority of our approach in hallucination reduction compared to baselines.
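
The interactive self-reflection methodology above is, at its core, a generate-check-revise loop. A minimal control-flow sketch in which `generate` and `is_supported` are stubs standing in for the LLM call and the factuality/entailment check:

```python
def generate(question, feedback):
    # Stub LLM: improves once it receives reflection feedback.
    return "grounded answer" if feedback else "hallucinated answer"

def is_supported(answer, knowledge):
    # Stub check: is the answer entailed by the acquired knowledge?
    return answer in knowledge

def self_reflect(question, knowledge, max_rounds=3):
    """Regenerate with self-feedback until the answer is supported."""
    feedback = None
    for _ in range(max_rounds):
        answer = generate(question, feedback)
        if is_supported(answer, knowledge):
            return answer
        feedback = f"revise: '{answer}' is not supported by the evidence"
    return answer  # give up after max_rounds

result = self_reflect("What treats condition X?", {"grounded answer"})
```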

Davis voted
Ishan voted

#3653 - Jia 2024
Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

Jia, Runsong; Zhang, Bowen; Méndez, Sergio J. Rodríguez; Omran, Pouya G.

arXiv 2024;():

2024

Ref ID: 8314

The proposed research aims to develop an innovative semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG). To address the limitations of traditional scholarly KG construction and utilization methods, which often fail to capture fine-grained details, we propose a novel framework that integrates the Deep Document Model (DDM) for comprehensive document representation and the KG-enhanced Query Processing (KGQP) for optimized complex query handling. DDM enables a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve query accuracy and efficiency with LLMs. By combining the ASKG with LLMs, our approach enhances knowledge utilization and natural language understanding capabilities. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments demonstrate that our framework is superior to baseline methods in terms of retrieval accuracy and query efficiency. We showcase the practical application of our framework in academic research scenarios, highlighting its potential to revolutionize scholarly knowledge management and discovery. This work empowers researchers to acquire and utilize knowledge from documents more effectively and provides a foundation for developing precise and reliable interactions with LLMs.
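
The LLM-SPARQL fusion above ultimately executes triple patterns against the ASKG. A minimal sketch of that matching step over a toy in-memory store (the identifiers and predicates are invented; the real system emits SPARQL for a graph store):

```python
# A SPARQL basic graph pattern reduced to one triple pattern, where None
# plays the role of a query variable.
kg = {
    ("paper:42", "dcterms:creator", "person:alice"),
    ("paper:42", "schema:about", "topic:knowledge_graphs"),
    ("paper:7", "dcterms:creator", "person:bob"),
}

def match(pattern):
    """Return all triples matching a (s, p, o) pattern; None = variable."""
    s, p, o = pattern
    return [t for t in kg
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Who wrote paper:42?"  ~  SELECT ?o WHERE { paper:42 dcterms:creator ?o }
authors = [o for _, _, o in match(("paper:42", "dcterms:creator", None))]
```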

Ishan voted
brandon voted

#566 - Jia 2022
The Method for Plausibility Evaluation of Knowledge Triple Based on QA

Jia, S. T.; Cao, J. X.

7th China Conference on Knowledge Graph and Semantic Computing (CCKS) 2022;1711():228-235

Qinhuangdao, PEOPLES R CHINA Springer International Publishing Ag 2022

DOI: 10.1007/978-981-19-8300-9_25 · Ref ID: 3376

At present, most methods for the knowledge graph completion (KGC) task rely heavily on external knowledge bases or graph representation learning. However, completing this task without using any external prior knowledge remains a huge challenge. To this end, we propose a novel framework that converts the plausibility evaluation of a knowledge triple into a question-answering (QA) task, following the ideas of KG-BERT and prompt learning. We also test the effect of different question types on the results. We then fine-tune two pre-trained language models, BERT-wwm-ext and ERNIE-Gram, on the generated sequences so that they can complete the QA task. We won 5th place at the CCKS 2022 track 1 rematch stage, which proves the effectiveness of our method.

Ishan voted
Xinchen voted
Final decision
What was the agreed final decision?

#119 - Jiang 2023
COMBO: A Complete Benchmark for Open KG Canonicalization

Jiang, C. Y.; Jiang, Y.; Wu, W. Q.; Zheng, Y. T.; Xie, P. J.; Tu, K. W.

17th Conference of the European-Chapter of the Association-for-Computational-Linguistics (EACL) 2023;():340-357

Dubrovnik, CROATIA Assoc Computational Linguistics-Acl 2023

Ref ID: 3329

An open knowledge graph (KG) consists of (subject, relation, object) triples extracted from millions of raw texts. The subject and object noun phrases and the relations in an open KG have severe redundancy and ambiguity and need to be canonicalized. Existing datasets for open KG canonicalization only provide gold entity-level canonicalization for noun phrases. In this paper, we present COMBO, a Complete Benchmark for Open KG canonicalization. Compared with existing datasets, we additionally provide gold canonicalization for relation phrases, gold ontology-level canonicalization for noun phrases, as well as source sentences from which triples are extracted. We also propose metrics for evaluating each type of canonicalization. On the COMBO dataset, we empirically compare previously proposed canonicalization methods as well as a few simple baseline methods based on pretrained language models. We find that properly encoding the phrases in a triple using pretrained language models results in better relation canonicalization and ontology-level canonicalization of the noun phrase. We release our dataset, baselines, and evaluation scripts at https://github.com/jeffchy/COMBO/tree/main.
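The canonicalization task's input/output shape can be shown with a trivial string-normalization baseline; the paper's actual baselines encode phrases with pretrained language models rather than rules like these:

```python
# Trivial normalization baseline for noun-phrase canonicalization: group
# surface forms by a normalized key. Purely illustrative of the task shape.
from collections import defaultdict

def canonicalize(phrases):
    clusters = defaultdict(list)
    for p in phrases:
        key = p.lower().replace("the ", "").rstrip("s")
        clusters[key].append(p)
    return dict(clusters)

print(canonicalize(["NYC", "nyc", "the cars", "car"]))
# {'nyc': ['NYC', 'nyc'], 'car': ['the cars', 'car']}
```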

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3948 - Jiang 2020
Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream

Jiang, Chen; Dehghan, Masood; Jagersand, Martin

arXiv 2020;():

2020

Ref ID: 7389

Manipulation tasks in daily life, such as pouring water, unfold intentionally under specialized manipulation contexts. Being able to process contextual knowledge in these Activities of Daily Living (ADLs) over time can help us understand manipulation intentions, which are essential for an intelligent robot to transition smoothly between various manipulation actions. In this paper, to model the intended concepts of manipulation, we present a vision dataset under a strictly constrained knowledge domain for both robot and human manipulations, where manipulation concepts and relations are stored by an ontology system in a taxonomic manner. Furthermore, we propose a scheme to generate a combination of visual attentions and an evolving knowledge graph filled with commonsense knowledge. Our scheme works with real-world camera streams and fuses an attention-based Vision-Language model with the ontology system. The experimental results demonstrate that the proposed scheme can successfully represent the evolution of an intended object manipulation procedure for both robots and humans. The proposed scheme allows the robot to mimic human-like intentional behaviors by watching real-time videos. We aim to develop this scheme further for real-world robot intelligence in Human-Robot Interaction.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3128 - Jiang 2024
Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models

Jiang, Feihu; Qin, Chuan; Yao, Kaichun; Fang, Chuyu; Zhuang, Fuzhen; Zhu, Hengshu; Xiong, Hui

Database Systems for Advanced Applications: 29th International Conference, DASFAA 2024, Gifu, Japan, July 2–5, 2024, Proceedings, Part IV 2024;():273–290

Gifu, Japan Springer-Verlag 2024

DOI: 10.1007/978-981-97-5562-2_18 · Ref ID: 7249

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3735 - Jiang 2024
Neuron-Level Sequential Editing for Large Language Models

Jiang, Houcheng; Fang, Junfeng; Zhang, Tianyu; Zhang, An; Wang, Ruipeng; Liang, Tao; Wang, Xiang

arXiv 2024;():

2024

Ref ID: 8663

This work explores sequential model editing in large language models (LLMs), a critical task that involves continuously modifying internal knowledge within LLMs through multi-round editing, each round incorporating updates or corrections to adjust the model outputs without the need for costly retraining. Existing model editing methods, especially those that alter model parameters, typically focus on single-round editing and often face significant challenges in sequential model editing, most notably issues of model forgetting and failure. To address these challenges, we introduce a new model editing method, namely Neuron-level Sequential Editing (NSE), tailored for supporting sequential model editing. Specifically, we optimize the target layer's hidden states using the model's original weights to prevent model failure. Furthermore, we iteratively select neurons in multiple layers for editing based on their activation values to mitigate model forgetting. Our empirical experiments demonstrate that NSE significantly outperforms current parameter-modifying model editing methods, marking a substantial advancement in the field of sequential model editing. Our code is released at https://github.com/jianghoucheng/NSE.
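The activation-based neuron selection can be sketched as a top-k pick; the values here are toy numbers, whereas NSE's actual criterion operates on model hidden states:

```python
# Sketch of the neuron-selection idea: take the k most strongly activated
# neurons in a layer as edit targets.
def select_neurons(activations, k):
    # Indices of the k largest activation values, strongest first.
    return sorted(range(len(activations)), key=lambda i: -activations[i])[:k]

print(select_neurons([0.1, 0.9, 0.4, 0.7], 2))  # [1, 3]
```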

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#3716 - Jiang 2024
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

Jiang, Jinhao; Li, Junyi; Zhao, Wayne Xin; Song, Yang; Zhang, Tao; Wen, Ji-Rong

arXiv 2024;():

2024

Ref ID: 8462

Adapting general large language models (LLMs) to specialized domains presents great challenges due to varied data distributions. This adaptation typically requires continual pre-training on massive domain-specific corpora to facilitate knowledge memorization, followed by training to apply this knowledge following human instructions and preferences. However, this method may result in inefficient knowledge memorization due to a lack of awareness of knowledge utilization and imposes substantial demands on LLMs to simultaneously learn knowledge utilization and format alignment with limited training samples. To facilitate the domain adaptation of LLMs, we revise this process and propose a new domain adaptation framework including domain knowledge learning and general format alignment, called Mix-CPT. Specifically, we first conduct a knowledge mixture continual pre-training that concurrently focuses on knowledge memorization and utilization, allowing for mutual reinforcement. To avoid catastrophic forgetting during the continual pre-training process, we further incorporate a logit swap self-distillation constraint. Subsequently, leveraging the knowledge and capabilities acquired during continual pre-training, we efficiently perform instruction tuning and alignment with a few general training samples to achieve format alignment. Extensive experiments demonstrate that our proposed Mix-CPT framework can simultaneously improve the task-solving capabilities of LLMs on the target and general domains compared to the traditional adaptation methods.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1004 - Jiang 2024
Augmenting NLP Models with Commonsense Knowledge

Jiang, M.; Lin, B. Y.; Wang, S.; Xu, Y.; Yu, W.; Zhu, C.

SpringerBriefs Comp. Sci. 2024;Part F2530():65-89

Springer 2024

DOI: 10.1007/978-981-97-0747-8_5 · Ref ID: 4308

This chapter focuses on augmenting NLP models with commonsense knowledge to enhance their performance in natural language understanding and generation tasks. We begin by discussing the importance of commonsense knowledge in NLP and the challenges faced by NLP models in reasoning with commonsense. We explore different types of commonsense knowledge and reasoning tasks, including multiple-choice tasks, open-ended QA, constrained NLG, and commonsense probing of language models. We then introduce the various techniques for augmenting NLP models with commonsense knowledge. We discuss the use of structured knowledge bases, such as ConceptNet, and the incorporation of graph networks for encoding structured knowledge. We also examine the augmentation of NLP models with un/semi-structured knowledge sources, such as text corpora and the use of dense passage retrieval for open-ended QA. Furthermore, we explore differentiable reasoning methods, such as DrFact, for reasoning with semi-structured knowledge. Finally, we discuss the use of neural knowledge models, such as COMET and LLMs, for incorporating commonsense knowledge. We explore the generation of commonsense knowledge graphs using LLMs and knowledge distillation techniques to create smaller, specialized commonsense models. We also examine the use of large language models for extracting relevant commonsense knowledge for reasoning. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1943 - Jiang 2023
Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models

Jiang, P.; Agarwal, S.; Jin, B.; Wang, X.; Sun, J.; Han, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():11161-11180

Association for Computational Linguistics (ACL) 2023

Ref ID: 5136

The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we propose TAGREAL that automatically generates quality query prompts and retrieves support information from large text corpora to probe knowledge from PLM for KG completion. The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets. We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods. © 2023 Association for Computational Linguistics.
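The cloze-style probing that TAGREAL automates can be shown with one hand-written pattern; note that the pattern below is invented for illustration, while TAGREAL mines quality prompts automatically from corpora:

```python
# Sketch of a cloze query: an incomplete triple (head, relation, ?) becomes a
# [MASK] prompt for a pre-trained language model.
def cloze_prompt(head: str, relation: str) -> str:
    patterns = {"place_of_birth": "{h} was born in [MASK]."}
    return patterns[relation].format(h=head)

print(cloze_prompt("Obama", "place_of_birth"))  # Obama was born in [MASK].
```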

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1229 - Jiang 2024
Efficient Knowledge Infusion via KG-LLM Alignment

Jiang, Z.; Zhong, L.; Sun, M.; Xu, J.; Sun, R.; Cai, H.; Luo, S.; Zhang, Z.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():2986-2999

Association for Computational Linguistics (ACL) 2024

Ref ID: 4235

To tackle the problem of domain-specific knowledge scarcity within large language models (LLMs), knowledge-graph retrieval-augmented methods have proven to be effective and efficient techniques for knowledge infusion. However, existing approaches face two primary challenges: a knowledge mismatch between publicly available knowledge graphs and the specific domain of the task at hand, and poor information compliance of LLMs with knowledge graphs. In this paper, we leverage a small set of labeled samples and a large-scale corpus to efficiently construct domain-specific knowledge graphs with an LLM, addressing the issue of knowledge mismatch. Additionally, we propose a three-stage KG-LLM alignment strategy to enhance the LLM's capability to utilize information from knowledge graphs. We conduct experiments in a limited-sample setting on two biomedical question-answering datasets, and the results demonstrate that our approach outperforms existing baselines. © 2024 Association for Computational Linguistics.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1290 - Jiayang 2024
EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs

Jiayang, C.; Qiu, L.; Chan, C.; Liu, X.; Song, Y.; Zhang, Z.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():6622-6642

European Language Resources Association (ELRA) 2024

Ref ID: 4568

Narrative reasoning relies on the understanding of eventualities in story contexts, which requires a wealth of background world knowledge. To help machines leverage such knowledge, existing solutions can be categorized into two groups. Some focus on implicitly modeling eventuality knowledge by pretraining language models (LMs) with eventuality-aware objectives. However, this approach breaks down knowledge structures and lacks interpretability. Others explicitly collect world knowledge of eventualities into structured eventuality-centric knowledge graphs (KGs). However, existing research on leveraging these knowledge sources for free-texts is limited. In this work, we propose an initial comprehensive framework called EventGround, which aims to tackle the problem of grounding free-texts to eventuality-centric KGs for contextualized narrative reasoning. We identify two critical problems in this direction: the event representation and sparsity problems. We provide simple yet effective parsing and partial information extraction methods to tackle these problems. Experimental results demonstrate that our approach consistently outperforms baseline models when combined with graph neural network (GNN) or large language model (LLM) based graph reasoning models. Our framework, incorporating grounded knowledge, achieves state-of-the-art performance while providing interpretable evidence. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#2454 - Jilani 2008
Formal Representations of the Data Flow Diagram: A Survey

Jilani, A. A. A.; Nadeem, A.; Kim, T. H.; Cho, E. S.

2008 Advanced Software Engineering and Its Applications 2008;():153-158

2008

DOI: 10.1109/ASEA.2008.34 · Ref ID: 6076

Structured analysis and design methodology has now been replaced by object-oriented analysis and design software development techniques. A major design artifact in the structured approach is the data flow diagram (DFD). The DFD is very important for the modernization of old legacy systems. It is also very useful in requirement elicitation. However, the DFD lacks formalism, and by representing the DFD formally, ambiguities and inconsistencies can be removed. Formal representation of the DFD and its formal semantics help in better understanding of requirements and design. In this paper, we present a survey of techniques that formally represent or give formal semantics to the data flow diagram. We analyze formal representation techniques using analysis parameters. On the basis of the identified parameters, we present an analysis table, which describes the strengths and weaknesses of the representation techniques.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1582 - Jin 2024
Learning to Verify and Assure Cyber-Physical Systems

Jin, H.; Zhang, T.; Ramamurthy, A.; Hamza, A.; Malinoski, M.

AIAA SciTech Forum and Exposition, 2024 2024;():

American Institute of Aeronautics and Astronautics Inc, AIAA 2024

DOI: 10.2514/6.2024-1853 · Ref ID: 4527

Certification of aircraft systems is a complex task that is difficult to automate because it requires significant subjective decision making. Established certification standards such as CFR-25 and MIL-HDBK-516 require considerable subjective analysis to transform certification requirements into meaningful, actionable requirements, which prevents the automation of verification, assurance, and certification tasks. While established methods rely on automating assurance case generation, several key tasks still require human input, preventing the co-creation of designs, verification scenarios, evidence, and assurance cases. Building on the recent success of large language models, we develop a framework that enables the automated verification and assurance of cyber-physical systems. Our framework consists of two parts: (i) an automated pipeline that extracts information from system artifacts so that they can be semantically linked in a graphical representation; and (ii) an automated pipeline that synthesizes verification scenarios to co-generate evidence along with the assurance case of the cyber-physical system. We demonstrate our framework on a concrete example of an aircraft landing gear sub-system and highlight the benefits that can be realized by automating the bottlenecks in verification, assurance, and certification. Using semantically linked representations of the knowledge, we enable complex reasoning over the knowledge contained in system artifacts, providing meaningful feedback to designers and certifiers on means to improve the overall system. Our investigation of the landing gear use case demonstrates the feasibility of using large language models to support systems engineering tasks for cyber-physical systems and of using knowledge graphs in the construction and assessment of the cyber-physical system's assurance.
© 2024 by the American Institute of Aeronautics and Astronautics, Inc.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1535 - Jin 2022
A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding

Jin, Z.; Men, T.; Yuan, H.; Zhou, Y.; Cao, P.; Chen, Y.; Xue, Z.; Liu, K.; Zhao, J.

EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Demonstrations Session 2022;():1-11

Association for Computational Linguistics (ACL) 2022

Ref ID: 5489

As the first step of modern natural language processing, text representation encodes discrete texts as continuous embeddings. Pre-trained language models (PLMs) have demonstrated strong ability in text representation and significantly promoted the development of natural language understanding (NLU). However, existing PLMs represent a text solely by its context, which is not enough to support knowledge-intensive NLU tasks. Knowledge is power, and fusing external knowledge explicitly into PLMs can provide knowledgeable text representations. Previous knowledge-enhanced methods differ in many aspects, making it difficult to reproduce them, implement new methods, and transfer between different methods. It is highly desirable to have a unified paradigm that encompasses all kinds of methods in one framework. In this paper, we propose CogKTR, a knowledge-enhanced text representation toolkit for natural language understanding. Following our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR consists of four key stages: knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks. Our unified, knowledgeable and modular toolkit is publicly available at GitHub, with an online system and a short instruction video. © 2022 Association for Computational Linguistics.
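The four UniKEP stages can be sketched as a pipeline skeleton; the stage bodies below are placeholders for illustration and are not CogKTR's API:

```python
# Skeleton of the four UniKEP stages named in the abstract.
def acquire(text):          # 1. knowledge acquisition: find entity mentions
    return [w for w in text.split() if w.istitle()]

def represent(entities):    # 2. knowledge representation: embed each entity
    return {e: [float(len(e))] for e in entities}

def inject(text, embeds):   # 3. knowledge injection: fuse text with knowledge
    return {"text": text, "knowledge": embeds}

def apply_task(enriched):   # 4. knowledge application: a downstream NLU task
    return len(enriched["knowledge"])

text = "Einstein lived in Princeton"
out = apply_task(inject(text, represent(acquire(text))))
print(out)  # 2 entities made it through the pipeline
```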

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3500 - Jinensibieke 2024
How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation

Jinensibieke, Dawulie; Maimaiti, Mieradilijiang; Xiao, Wentao; Zheng, Yuanhang; Wang, Xiaobo

arXiv 2024;():

2024

Ref ID: 8393

Relation Extraction (RE) serves as a crucial technology for transforming unstructured text into structured information, especially within the framework of Knowledge Graph development. Its importance is emphasized by its essential role in various downstream tasks. Besides the conventional RE methods based on neural networks and pre-trained language models, large language models (LLMs) are also utilized in the research field of RE. However, on low-resource languages (LRLs), both conventional RE methods and LLM-based methods perform poorly due to data scarcity. To this end, this paper constructs low-resource relation extraction datasets in 10 LRLs across three regions (Central Asia, Southeast Asia, and the Middle East). The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel and CrossRE) using effective multilingual machine translation. Then, we use language perplexity (PPL) to filter out low-quality data from the translated datasets. Finally, we conduct an empirical study and validate the performance of several open-source LLMs on these generated LRL RE datasets.
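The PPL-based filtering step reduces to a thresholded keep/drop decision; here the perplexity values are supplied directly, whereas in the paper they come from a language model:

```python
# Sketch of the perplexity filter: keep translated sentences whose PPL is
# below a cutoff, dropping likely-garbled translations.
def filter_by_ppl(pairs, max_ppl):
    # pairs: iterable of (sentence, perplexity)
    return [sent for sent, ppl in pairs if ppl <= max_ppl]

data = [("good translation", 35.0), ("garbled output", 900.0)]
print(filter_by_ppl(data, 100.0))  # ['good translation']
```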

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#615 - Jing 2022
A Novel Named Entity Recognition Algorithm for Hot Strip Rolling Based on BERT-Imseq2seq-CRF Model

Jing, F. W.; Zhang, M. Y.; Li, J.; Xu, G. Z.; Wang, J.

Appl. Sci.-Basel 2022;12(22):13

2022

DOI: 10.3390/app122211418 · Ref ID: 3363

Named entity recognition is not only the first step of text information extraction, but also a key process in constructing domain knowledge graphs. In view of the large amount of text data, the complex process flow, and the urgent application needs of the hot strip rolling process, a novel named entity recognition algorithm based on the BERT-Imseq2seq-CRF model is proposed in this paper. Firstly, the algorithm uses the BERT pre-trained language model to mine the dependencies in the domain text and obtain the corresponding representation vector. Then, the representation vector is sent to the encoder layer, and the output vector is simultaneously fed to the decoder, whereas the original model considers only the semantic vector. The Teacher-Forcing mechanism is integrated into the decoder layer to randomly modify the labeling results, avoiding error accumulation and guaranteeing the sequence recognition effect. Finally, the validity of the labeling results is checked against the conditional random field constraints, improving the overall labeling quality of the algorithm. The experimental results show that this model can efficiently and accurately predict the entity labels of hot strip rolling, and the model's performance indices are better than those of other models, with the F1-score reaching 91.47%. This model further provides technical support for information extraction and domain knowledge graph construction in hot strip rolling.
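The Teacher-Forcing mechanism in the decoder can be sketched as a coin flip between the gold label and the model's own previous prediction; the labels and probability below are illustrative, not the paper's settings:

```python
# Sketch of Teacher Forcing: with probability p the gold label is fed back
# to the decoder instead of the model's previous prediction.
import random

def next_input(gold_label, model_prediction, p, rng):
    return gold_label if rng.random() < p else model_prediction

rng = random.Random(0)  # seeded for reproducibility
choices = [next_input("B-ENT", "O", 0.5, rng) for _ in range(4)]
print(choices)  # a mix of gold labels and model predictions
```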

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1772 - Jing 2023
Prompt-assisted Relation Fusion in Knowledge Graph Acquisition

Jing, X.; Rayz, J. M.

Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics 2023;():2960-2965

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/SMC53992.2023.10394554 · Ref ID: 4910

This paper investigated how prompt-based learning techniques can assist with relation fusion in Knowledge Graph (KG) acquisition. We created an unsupervised framework to generate a KG from a real-world dataset. The framework incorporates prompting with knowledge entity metadata and generating predicate embeddings with the pretrained Masked Language Model (MLM) RoBERTa. Predicate embeddings were clustered to form conceptual groups, and feature tokens were used to derive relation labels. In addition, we conducted a comparative study on the effects of different prompting templates. The resulting relation labels were evaluated by human annotators, which indicated that prompt-based learning, if applied appropriately, can help with deducing conceptualized relations. Our framework proposes a way to improve the quality of KGs acquired using traditional Relation Extraction (RE). It can also assist human experts effectively in semi-automated knowledge acquisition. © 2023 IEEE.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#138 - Jovanovic 2023
Connecting AI: Merging Large Language Models and Knowledge Graph

Jovanovic, M.; Campbell, M.

Computer 2023;56(11):103-108

2023

DOI: 10.1109/mc.2023.3305206 · Ref ID: 2927

Combining the generative abilities of large language models with the logical and factual coherence of knowledge graphs using a connected artificial intelligence architecture minimizes each system's shortcomings and amplifies their strengths across many real-world domains.

Srividya voted
Mike voted
Final decision
What was the agreed final decision?

#1437 - Ju 2024
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models

Ju, T.; Chen, Y.; Yuan, X.; Zhang, Z.; Du, W.; Zheng, Y.; Liu, G.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():8987-9001

Association for Computational Linguistics (ACL) 2024

Ref ID: 4380

Recent work has showcased the powerful capability of large language models (LLMs) in recalling knowledge and reasoning. However, the reliability of LLMs in combining these two capabilities into reasoning through multi-hop facts has not been widely explored. This paper systematically investigates the possibilities for LLMs to utilize shortcuts based on direct connections between the initial and terminal entities of multi-hop knowledge. We first explore the existence of factual shortcuts through Knowledge Neurons, revealing that: (i) the strength of factual shortcuts is highly correlated with the frequency of co-occurrence of initial and terminal entities in the pre-training corpora; (ii) few-shot prompting leverages more shortcuts in answering multi-hop questions compared to chain-of-thought prompting. Then, we analyze the risks posed by factual shortcuts from the perspective of multi-hop knowledge editing. Analysis shows that approximately 20% of the failures are attributed to shortcuts, and the initial and terminal entities in these failure instances usually have higher co-occurrences in the pre-training corpus. Finally, we propose erasing shortcut neurons to mitigate the associated risks and find that this approach significantly reduces failures in multi-hop knowledge editing caused by shortcuts. Code is publicly available at https://github.com/Jometeorie/MultiHopShortcuts. © 2024 Association for Computational Linguistics.
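The co-occurrence statistic the paper correlates with shortcut strength can be sketched as a simple document-level count; the two-document "corpus" below is a toy stand-in for a pre-training corpus:

```python
# Count how often two entities appear in the same pre-training document.
def cooccurrence(corpus, e1, e2):
    return sum(1 for doc in corpus if e1 in doc and e2 in doc)

corpus = [
    "Obama was born in Honolulu, a city in the United States.",
    "Honolulu is the capital of Hawaii.",
]
print(cooccurrence(corpus, "Obama", "United States"))  # 1
```

A high count for the initial and terminal entities of a multi-hop chain signals a likely shortcut past the intermediate entity.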

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#212 - Kahlawi 2024
Enhancing Administrative Source Registers for the Development of a Robust Large Language Model: A Novel Methodological Approach

Kahlawi, A.; Martelli, C.

Int. J. Adv. Comput. Sci. Appl. 2024;15(7):9-17

2024

Ref ID: 3073

Accurate statistical information is critical for understanding, describing, and managing socio-economic systems. While data availability has increased, often it does not meet the quality requirements for effective governance. Administrative registers are crucial for statistical information production, but their potential is hampered by quality issues stemming from administrative inconsistencies. This paper explores the integration of semantic technologies, including ontologies and knowledge graphs, with administrative databases to improve data quality. We discuss the development of large language models (LLMs) that enable a robust, queryable framework, facilitating the integration of disparate data sources. This approach ensures high-quality administrative data, essential for statistical reuse and the development of comprehensive, dynamic knowledge graphs and LLMs tailored for administrative applications.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3472 - Kalifa 2024
GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning

Kalifa, Dan; Singer, Uriel; Radinsky, Kira

arXiv 2024;():

2024

Ref ID: 8500

Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in utilizing machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on the amino acid sequence of proteins and lack factual knowledge about proteins and their interactions, thus limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows for the integration of information at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective learning process through graph-based learning. By doing so, we can capture complex relationships and dependencies between proteins and their functional annotations, resulting in more robust and contextually enriched protein representations. Unlike previous fusion methods, GOProteinGNN uniquely learns the entire protein knowledge graph during training, which allows it to capture broader relational nuances and dependencies beyond mere triplets as done in previous work. We perform a comprehensive evaluation on several downstream tasks demonstrating that GOProteinGNN consistently outperforms previous methods, showcasing its effectiveness and establishing it as a state-of-the-art solution for protein representation learning.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#479 - Kalo 2020
KnowlyBERT - Hybrid Query Answering over Language Models and Knowledge Graphs

Kalo, J. C.; Fichtel, L.; Ehler, P.; Balke, W. T.

19th International Semantic Web Conference (ISWC) 2020;12506():294-310

Athens, GREECE Springer International Publishing Ag 2020

DOI: 10.1007/978-3-030-62419-4_17 · Ref ID: 2919

Providing a plethora of entity-centric information, Knowledge Graphs have become a vital building block for a variety of intelligent applications. Indeed, modern knowledge graphs like Wikidata already capture several billions of RDF triples, yet they still lack good coverage for most relations. On the other hand, recent developments in NLP research show that neural language models can easily be queried for relational knowledge without requiring massive amounts of training data. In this work, we leverage this idea by creating a hybrid query answering system on top of knowledge graphs in combination with the masked language model BERT to complete query results. We thus combine valuable structural and semantic information from knowledge graphs with textual knowledge from language models to achieve high-precision query results. Standard techniques for dealing with incomplete knowledge graphs are either (1) relation extraction, which requires massive amounts of training data, or (2) knowledge graph embeddings, which struggle to succeed beyond simple baseline datasets. Our hybrid system KnowlyBERT requires only small amounts of training data, while outperforming state-of-the-art techniques by boosting their precision by over 30% in our large Wikidata experiment.
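The hybrid idea reduces to a fallback: answer from the KG when it has the fact, otherwise consult the language model. A minimal sketch with a stubbed LM (all data and names below are invented):

```python
# Hybrid KG + LM query answering, caricatured.
KG = {("Berlin", "capital_of"): "Germany"}

def lm_predict(subject, relation):
    # Stand-in for querying a masked language model.
    return {("Vienna", "capital_of"): "Austria"}.get((subject, relation))

def hybrid_answer(subject, relation):
    # Prefer the curated KG fact; fall back to the LM's prediction.
    return KG.get((subject, relation)) or lm_predict(subject, relation)

print(hybrid_answer("Berlin", "capital_of"))  # Germany
print(hybrid_answer("Vienna", "capital_of"))  # Austria
```

KnowlyBERT's actual combination is richer (it filters and ranks LM predictions using KG type information), but the precedence of curated facts over generated ones is the core of the design.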

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1616 - Kalo 2023
LM-KBC 2023: 2nd Challenge on Knowledge Base Construction from Pre-trained Language Models

Kalo, J. C.; Singhania, S.; Razniewski, S.; Pan, J. Z.

CEUR Workshop Proceedings 2023;3577():

CEUR-WS 2023

Ref ID: 5031

Large language models (LLMs) like chatGPT [1] have advanced a range of semantic tasks and are being ubiquitously used for knowledge extraction. Although several works have explored this ability by crafting prompts with in-context or instruction learning, the viability of complete and precise knowledge base construction from LMs is still in its nascent form. In the 2nd edition of this challenge, we invited participants to extract disambiguated knowledge triples from LMs for a given set of subjects and relations. In a crucial difference from existing probing benchmarks like LAMA [2], we made no simplifying assumptions on relation cardinalities, i.e., a subject-entity can stand in relation with zero, one, or many object-entities. Furthermore, submissions needed to go beyond just ranking predicted surface strings, and materialize disambiguated entities in the output, which were evaluated using established KB metrics of precision, recall, and F1-score. The challenge had two tracks: (1) a small model track, where models with < 1 billion parameters could be probed, and (2) an open track, where participants could use any LM of their choice. We received seven submissions, two for track 1 and five for track 2. We present the contributions and insights of the submitted peer-reviewed submissions and lay out the possible paths for future work. All the details related to the challenge can be found on our website at https://lm-kbc.github.io/challenge2023/. © 2023 CEUR-WS. All rights reserved.
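Set-based precision/recall/F1 over predicted vs. gold object entities can be sketched as follows; the edge-case conventions (empty prediction or gold sets) are one reasonable choice and may differ from the official scorer:

```python
# Per-subject precision/recall/F1 over sets of disambiguated object entities.
def prf(pred, gold):
    tp = len(pred & gold)  # correctly predicted entities
    p = tp / len(pred) if pred else (1.0 if not gold else 0.0)
    r = tp / len(gold) if gold else 1.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

print(prf({"Q1", "Q2"}, {"Q2", "Q3"}))  # (0.5, 0.5, 0.5)
```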

Mike voted
Xinchen voted

#3723 - Kalyanpur 2024
Multi-step Inference over Unstructured Data

Kalyanpur, Aditya; Saravanakumar, Kailash Karthik; Barres, Victor; McFate, C. J.; Moon, Lori; Seifu, Nati; Eremeev, Maksim; Barrera, Jose; Bautista-Castillo, Abraham; Brown, Eric; Ferrucci, David

arXiv 2024;():

2024

Ref ID: 8423

The advent of Large Language Models (LLMs) and Generative AI has revolutionized natural language applications across various domains. However, high-stakes decision-making tasks in fields such as medical, legal and finance require a level of precision, comprehensiveness, and logical consistency that pure LLM or Retrieval-Augmented-Generation (RAG) approaches often fail to deliver. At Elemental Cognition (EC), we have developed a neuro-symbolic AI platform to tackle these problems. The platform integrates fine-tuned LLMs for knowledge extraction and alignment with a robust symbolic reasoning engine for logical inference, planning and interactive constraint solving. We describe Cora, a Collaborative Research Assistant built on this platform, that is designed to perform complex research and discovery tasks in high-stakes domains. This paper discusses the multi-step inference challenges inherent in such domains, critiques the limitations of existing LLM-based methods, and demonstrates how Cora's neuro-symbolic approach effectively addresses these issues. We provide an overview of the system architecture, key algorithms for knowledge extraction and formal reasoning, and present preliminary evaluation results that highlight Cora's superior performance compared to well-known LLM and RAG baselines.

Ishan voted
brandon voted

#567 - Kaneda 2023
A Method to Construct a Masked Knowledge Graph Model using Transformer for Knowledge Graph Reasoning

Kaneda, R.; Okada, M.; Mori, N.; Ieee

17th IEEE International Conference on Semantic Computing (ICSC) 2023;():298-299

Laguna Hills, CA Ieee Computer Soc 2023

DOI: 10.1109/icsc56153.2023.00061 · Ref ID: 2966

Most of the previous methods using machine learning for this challenge generate a new knowledge graph from the original one, and some information is lost in the process of creating a new knowledge graph. Therefore, we proposed a new model to estimate the criminal without changing the original knowledge graph. The proposed model uses a Transformer and allows the estimation of unknown criminals in nonexistent scenes by learning similar to Masked Language Modeling in BERT. This model, which uses the original knowledge graph, is expected to infer information about the crime scene at the same time as predicting the criminal. We confirmed by experiments that the model had gained the ability to estimate the hidden story parts by considering the surrounding stories.

Mike voted
Davis voted

#467 - Kang 2024
Knowledge-aware adaptive graph network for commonsense question answering

Kang, L.; Li, X. G.; An, X. C.

J. Intell. Inf. Syst. 2024;62(5):1305-1324

2024

DOI: 10.1007/s10844-024-00854-z · Ref ID: 3207

Commonsense Question Answering (CQA) aims to select the correct answers to common knowledge questions. Most existing approaches focus on integrating external knowledge graph (KG) representations with question context representations to facilitate reasoning. However, the approaches cannot effectively select the correct answer due to (i) the incomplete reasoning chains when using knowledge graphs as external knowledge, and (ii) the insufficient understanding of semantic information of the question during the reasoning process. Here we propose a novel model, KA-AGN. First, we utilize a joint representation of dependency parse trees and language models to describe QA pairs. Next, we introduce question semantic information as nodes into a knowledge subgraph and compute the correlations between nodes using adaptive graph networks. Finally, bidirectional attention and graph pruning are employed to update the question representation and the knowledge subgraph representation. To evaluate the performance of our method, we conducted experiments on two widely used benchmark datasets: CommonsenseQA and OpenBookQA. The ablation experiment results demonstrate the effectiveness of the adaptive graph network in enhancing reasoning chains, while showing the ability of the joint representation of dependency parse trees and language models to correctly understand question semantics. Our code is publicly available at https://github.com/agfsghfdhg/KAAGN-main.

Srividya voted
Xinchen voted

#3610 - Kang 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Kang, Minki; Lee, Seanie; Baek, Jinheon; Kawaguchi, Kenji; Hwang, Sung Ju

arXiv 2023;():

2023

Ref ID: 7736

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to their high computational requirements and concerns on data privacy. Previous studies have focused on building task-specific small Language Models (LMs) by fine-tuning them with labeled data or distilling LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks due to the limited capacity of small LMs in memorizing the knowledge required. Motivated by our theoretical analysis on memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales obtained from LLMs with augmented knowledge retrieved from an external knowledge base. Moreover, we further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets, namely MedQA-USMLE, StrategyQA, and OpenbookQA. Notably, our method makes the 250M T5 models achieve superior performance against the fine-tuned 3B models, having 12 times larger parameters, on both MedQA-USMLE and StrategyQA benchmarks.

Srividya voted
Ishan voted

#3236 - Kang 2024
Bridging Law and Data: Augmenting Reasoning via a Semi-Structured Dataset with IRAC methodology

Kang, Xiaoxi; Qu, Lizhen; Soon, Lay-Ki; Li, Zhuang; Trakic, Adnan

arXiv 2024;():

2024

Ref ID: 8402

The effectiveness of Large Language Models (LLMs) in legal reasoning is often limited due to the unique legal terminologies and the necessity for highly specialized knowledge. These limitations highlight the need for high-quality data tailored for complex legal reasoning tasks. This paper introduces LEGALSEMI, a benchmark specifically curated for legal scenario analysis. LEGALSEMI comprises 54 legal scenarios, each rigorously annotated by legal experts, based on the comprehensive IRAC (Issue, Rule, Application, Conclusion) framework. In addition, LEGALSEMI is accompanied by a structured knowledge graph (SKG). A series of experiments were conducted to assess the usefulness of LEGALSEMI for IRAC analysis. The experimental results demonstrate the effectiveness of incorporating the SKG for issue identification, rule retrieval, application and conclusion generation using four different LLMs. LEGALSEMI will be publicly available upon acceptance of this paper.

Kwesi voted
mohammed afaan voted

#3917 - Kannan 2024
A Timeline and Analysis for Representation Plasticity in Large Language Models

Kannan, Akshat

arXiv 2024;():

2024

Ref ID: 8676

The ability to steer AI behavior is crucial to preventing its long term dangerous and catastrophic potential. Representation Engineering (RepE) has emerged as a novel, powerful method to steer internal model behaviors, such as "honesty", at a top-down level. Understanding the steering of representations should thus be placed at the forefront of alignment initiatives. Unfortunately, current efforts to understand plasticity at this level are highly neglected. This paper aims to bridge the knowledge gap and understand how LLM representation stability, specifically for the concept of "honesty", and model plasticity evolve by applying steering vectors extracted at different fine-tuning stages, revealing differing magnitudes of shifts in model behavior. The findings are pivotal, showing that while early steering exhibits high plasticity, later stages have a surprisingly responsive critical window. This pattern is observed across different model architectures, signaling that there is a general pattern of model plasticity that can be used for effective intervention. These insights greatly contribute to the field of AI transparency, addressing a pressing lack of efficiency limiting our ability to effectively steer model behavior.

brandon voted
Kwesi voted

#1349 - Karacapilidis 2024
Generative AI and Public Deliberation: A Framework for LLM-augmented Digital Democracy

Karacapilidis, N.; Kalampokis, E.; Giarelis, N.; Mastrokostas, C.

CEUR Workshop Proceedings 2024;3737():

CEUR-WS 2024

Ref ID: 4461

Aiming to augment the effectiveness and scalability of existing digital deliberation platforms, while also facilitating evidence-based collective decision making and increasing citizen participation and trust, this article (i) reviews state-of-the-art applications of LLMs in diverse public deliberation issues; (ii) proposes a novel digital deliberation framework that meaningfully incorporates Knowledge Graphs and neuro-symbolic reasoning approaches to improve the factual accuracy and reasoning capabilities of LLMs, and (iii) demonstrates the potential of the proposed solution through two key deliberation tasks, namely fact checking and argument building. The article provides insights about how modern AI technology should be used to address the equity perspective, helping citizens to construct robust and informed arguments, refine their prose, and contribute comprehensible feedback; and aiding policy makers in obtaining a deep understanding of the evolution and outcome of a deliberation. © 2024 Copyright for this paper by its authors.

brandon voted
Kwesi voted

#23 - Kardos 2023
Are These Descriptions Referring to the Same Entity or Just to Similar Ones?

Kardos, P.; Farkas, R.

19th International Conference on Artificial Intelligence Applications and Innovations (AIAI) 2023;676():387-398

Leon, SPAIN Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-34107-6_31 · Ref ID: 3151

The Knowledge Graph matching task is to identify nodes in two graphs that refer to the same concept. In this paper, we focus on the analysis of textual descriptions of the concepts. We employ neural language models as they can score well on text content similarity. On the other hand, we show that text similarity of entity descriptions is not equivalent to referring to the exact same entity. Our text-based multi-step system was among the top participants at the Knowledge Graph matching track of the Ontology Alignment Evaluation Initiative.
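
The paper's caveat, that high description similarity does not imply identity, can be illustrated with a minimal stdlib bag-of-words cosine; the real system uses neural language models, and the two descriptions below are our own toy example.

```python
# Two descriptions of *different* entities can still score very high on
# text similarity, so similarity alone cannot settle entity matching.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

d1 = "city in Saxony , Germany"   # e.g. Leipzig
d2 = "city in Bavaria , Germany"  # e.g. Munich: similar words, different entity
print(round(cosine(d1, d2), 2))   # prints 0.8
```

A high score like this is exactly the false-positive case a multi-step matcher has to guard against.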

Mike voted
mohammed afaan voted

#570 - Kasner 2023
Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models

Kasner, Z.; Konstas, I.; Dusek, O.

17th Conference of the European-Chapter of the Association-for-Computational-Linguistics (EACL) 2023;():2398-2415

Dubrovnik, CROATIA Assoc Computational Linguistics-Acl 2023

Ref ID: 3161

Pretrained language models (PLMs) for data-to-text (D2T) generation can use human-readable data labels such as column headings, keys, or relation names to generalize to out-of-domain examples. However, the models are well known to produce semantically inaccurate outputs if these labels are ambiguous or incomplete, which is often the case in D2T datasets. In this paper, we expose this issue on the task of describing a relation between two entities. For our experiments, we collect a novel dataset for verbalizing a diverse set of 1,522 unique relations from three large-scale knowledge graphs (Wikidata, DBPedia, YAGO). We find that although PLMs for D2T generation expectedly fail on unclear cases, models trained with a large variety of relation labels are surprisingly robust in verbalizing novel, unseen relations. We argue that using data with a diverse set of clear and meaningful labels is key to training D2T generation systems capable of generalizing to novel domains.

Mike voted
Srividya voted

#3268 - Kau 2024
Combining Knowledge Graphs and Large Language Models

Kau, Amanda; He, Xuzeng; Nambissan, Aishwarya; Astudillo, Aland; Yin, Hui; Aryani, Amir

arXiv 2024;():

2024

Ref ID: 8452

In recent years, Natural Language Processing (NLP) has played a significant role in various Artificial Intelligence (AI) applications such as chatbots, text generation, and language translation. The emergence of large language models (LLMs) has greatly improved the performance of these applications, showing astonishing results in language understanding and generation. However, they still show some disadvantages, such as hallucinations and lack of domain-specific knowledge, that affect their performance in real-world tasks. These issues can be effectively mitigated by incorporating knowledge graphs (KGs), which organise information in structured formats that capture relationships between entities in a versatile and interpretable fashion. Likewise, the construction and validation of KGs present challenges that LLMs can help resolve. The complementary relationship between LLMs and KGs has led to a trend that combines these technologies to achieve trustworthy results. This work collected 28 papers outlining methods for KG-powered LLMs, LLM-based KGs, and LLM-KG hybrid approaches. We systematically analysed and compared these approaches to provide a comprehensive overview highlighting key trends, innovative techniques, and common challenges. This synthesis will benefit researchers new to the field and those seeking to deepen their understanding of how KGs and LLMs can be effectively combined to enhance the capabilities of AI applications.

Srividya voted
Xinchen voted

#3751 - Ke 2024
Retrieval Augmented Generation for 10 Large Language Models and its Generalizability in Assessing Medical Fitness

Ke, Yu He; Jin, Liyuan; Elangovan, Kabilan; Abdullah, Hairil Rizal; Liu, Nan; Sia, Alex Tiong Heng; Soh, Chai Rick; Tung, Joshua Yi Min; Ong, Jasmine Chiat Ling; Kuo, Chang-Fu; Wu, Shao-Chun; Kovacheva, Vesela P.; Ting, Daniel Shu Wei

arXiv 2024;():

2024

Ref ID: 8688

Large Language Models (LLMs) show potential for medical applications but often lack specialized clinical knowledge. Retrieval Augmented Generation (RAG) allows customization with domain-specific information, making it suitable for healthcare. This study evaluates the accuracy, consistency, and safety of RAG models in determining fitness for surgery and providing preoperative instructions. We developed LLM-RAG models using 35 local and 23 international preoperative guidelines and tested them against human-generated responses. A total of 3,682 responses were evaluated. Clinical documents were processed using Llamaindex, and 10 LLMs, including GPT3.5, GPT4, and Claude-3, were assessed. Fourteen clinical scenarios were analyzed, focusing on seven aspects of preoperative instructions. Established guidelines and expert judgment were used to determine correct responses, with human-generated answers serving as comparisons. The LLM-RAG models generated responses within 20 seconds, significantly faster than clinicians (10 minutes). The GPT4 LLM-RAG model achieved the highest accuracy (96.4% vs. 86.6%, p=0.016), with no hallucinations and producing correct instructions comparable to clinicians. Results were consistent across both local and international guidelines. This study demonstrates the potential of LLM-RAG models for preoperative healthcare tasks, highlighting their efficiency, scalability, and reliability.
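
As a rough illustration of the retrieve-then-prompt pattern evaluated above, the sketch below ranks guideline snippets by plain token overlap and stuffs the top-k into a prompt. The snippets, query, and overlap scorer are illustrative stand-ins; the study itself used Llamaindex over real preoperative guidelines.

```python
# Minimal RAG retrieval step: score documents against the query, keep the
# top-k, and build a context-restricted prompt for the LLM.

def overlap(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def retrieve(query, docs, k=2):
    return sorted(docs, key=lambda d: overlap(query, d), reverse=True)[:k]

guidelines = [
    "Stop aspirin 7 days before elective surgery unless cardiology advises otherwise.",
    "Patients may drink clear fluids up to 2 hours before anesthesia.",
    "Annual influenza vaccination is recommended for healthcare workers.",
]
query = "How long before surgery should aspirin be stopped?"
context = retrieve(query, guidelines)
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQ: " + query
```

Restricting the model to retrieved guideline text is what lets such systems customize a general LLM with domain-specific clinical knowledge.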

yuexi voted
mohammed afaan voted

#2859 - Keber 2024
A Review on Neuro-symbolic AI Improvements to Natural Language Processing

Keber, M.; Grubišić, I.; Barešić, A.; Jović, A.

2024 47th MIPRO ICT and Electronics Convention (MIPRO) 2024;():66-72

2024

DOI: 10.1109/MIPRO60963.2024.10569741 · Ref ID: 6121

Symbolic artificial intelligence (AI) reflects the domain knowledge of experts and adheres to the logic of the subject area, rules, or any relations between entities. Connectionist (neuro) approaches based on artificial neural networks are excellent for extracting abstract features, contextualizing, and embedding interactions between features. When connectionist and symbolic approaches are properly aligned in a model, they benefit from complementary strengths; the combination is referred to as a hybrid or neuro-symbolic artificial intelligence (NSAI) model. The advantages that NSAI brings to the field of natural language processing (NLP) have received little attention from researchers in recent years. Therefore, in this review, we focus on the impact of neuro-symbolic approaches for NLP tasks, i.e. text classification, information extraction, machine translation, and language understanding. Relevant research articles from Scopus, Web of Science, and Google Scholar were carefully examined using appropriate keywords in the period from 2019 to 2024. The review aims to show the types of NSAI systems, identify the motivation for using NSAI, evaluate the use of additional annotations for content description, and briefly describe how the neuro-symbolic connection improves the methodology and enables trustworthy and explainable AI systems in current NLP research. The review also highlights areas of application and improvements achieved by NSAI approaches in benchmarks.

brandon voted
Kwesi voted

#3670 - Khan 2024
LLM+KG@VLDB'24 Workshop Summary

Khan, Arijit; Wu, Tianxing; Chen, Xi

arXiv 2024;():

2024

Ref ID: 8651

The unification of large language models (LLMs) and knowledge graphs (KGs) has emerged as a hot topic. At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in Guangzhou, China, one of the key themes explored was important data management challenges and opportunities due to the effective interaction between LLMs and KGs. This report outlines the major directions and approaches presented by various speakers during the LLM+KG'24 workshop.

Davis voted
Mike voted

#2586 - Kharitonov 2022
Intelligent Technologies for Projective Thinking and Research Management in the Knowledge Representation System

Kharitonov, V. A.; Krivogina, D. N.; Salamatina, A. S.; Guselnikova, E. D.; Spirina, V. S.; Markvirer, V. D.

2022 International Conference on Quality Management, Transport and Information Security, Information Technologies (IT&QM&IS) 2022;():292-295

2022

DOI: 10.1109/ITQMIS56172.2022.9976719 · Ref ID: 6044

It is proposed to address existing methodological issues in the educational process with the development of intellectual technologies and knowledge representation systems to improve the efficiency of higher education institutions. For this purpose, the structure of relational database is proposed, it will store the information about defended dissertations in the form of a set of attributes (heuristics), representing the mandatory qualification attributes of theses. An inference algorithm is proposed to process the information. This algorithm represents an artificial intelligence, its work is aimed at generating queries based on the applicant preferences. The result of the algorithm's work will be a set of choices, presented in ranked order. Given technologies will allow applicants to quickly become familiar with known scientific results and serve as a starting point for new research. The demand for co-researcher practice in solving the problem of updating the projective thinking methodology and managing the scientific research process has been justified. This article pays attention to the existing parallels between the concepts of technical and human sciences in the framework of their convergence. The concepts of being (economic good and economic utility) and the concepts of consciousness (humanitarian economic good and humanitarian economic utility) are used to form projective thinking. They form direct and inverse correspondences of technology and humanitarian practice in the techno-humanitarian mathematical space. It is proposed to place processed information from the language of context-free formal grammar dissertation abstracts in this space. 
The principle of data manipulation based on formal languages with context-free grammar makes it possible to create new structures of subject areas in terms of applicants' preferences. It is believed that the success of applicants’ work depends directly on the cognitive training of applicants, which needs to be practiced psychologically. This practice is based on deepening the objectivity and adequacy qualities of obtaining information on the basis of heuristic methods. It requires increased attention and development of intelligence. The paper shows that the use of heuristic methods by applicants to find new research directions leads to several promising results. These results can be perceived as potential options in future research. This contributes to an increase in the level of retention of higher education professionals.

mohammed afaan voted
yuexi voted

#3363 - Khlaut 2024
Efficient Medical Question Answering with Knowledge-Augmented Question Generation

Khlaut, Julien; Dancette, Corentin; Ferreres, Elodie; Bennani, Alaedine; Hérent, Paul; Manceron, Pierre

arXiv 2024;():

2024

Ref ID: 8312

In the expanding field of language model applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. Large language models, such as GPT-4, obtain reasonable scores on medical question answering tasks, but smaller models are far behind. In this work, we introduce a method to improve the proficiency of a small language model in the medical domain by employing a two-fold approach. We first fine-tune the model on a corpus of medical textbooks. Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model. Additionally, we introduce ECN-QA, a novel medical question answering dataset containing ``progressive questions'' composed of related sequential questions. We show the benefits of our training strategy on this dataset. The study's findings highlight the potential of small language models in the medical domain when appropriately fine-tuned. The code and weights are available at https://github.com/raidium-med/MQG.

Ishan voted
Xinchen voted

#1300 - Khorashadizadeh 2023
Exploring In-Context Learning Capabilities of Foundation Models for Generating Knowledge Graphs from Text

Khorashadizadeh, H.; Mihindukulasooriya, N.; Tiwari, S.; Groppe, J.; Groppe, S.

CEUR Workshop Proceedings 2023;3447():132-153

CEUR-WS 2023

Ref ID: 5243

Knowledge graphs can represent information about the real world using entities and their relations in a structured and semantically rich manner, and they enable a variety of downstream applications such as question-answering, recommendation systems, semantic search, and advanced analytics. However, at the moment, building a knowledge graph involves a lot of manual effort, which hinders its application in some situations; automating this process would especially benefit small organizations. Automatically generating structured knowledge graphs from a large volume of natural language is still a challenging task, and research on sub-tasks such as named entity extraction, relation extraction, entity and relation linking, and knowledge graph construction aims to improve the state of the art of automatic construction and completion of knowledge graphs from text. The recent advancement of foundation models with billions of parameters, trained in a self-supervised manner with large volumes of training data and adaptable to a variety of downstream tasks, has helped to demonstrate high performance on a large range of Natural Language Processing (NLP) tasks. In this context, one emerging paradigm is in-context learning, where a language model is used as it is with a prompt that provides instructions and some examples to perform a task without changing the parameters of the model using traditional approaches such as fine-tuning. This way, no computing resources are needed for re-training/fine-tuning the models and the engineering effort is minimal. Thus, it would be beneficial to utilize such capabilities for generating knowledge graphs from text. In this paper, grounded by several research questions, we explore the capabilities of foundation models such as ChatGPT to generate knowledge graphs from the knowledge it captured during pre-training as well as the new text provided to it in the prompt. 
The paper provides a qualitative analysis of a set of example outputs generated by a foundation model with the aim of knowledge graph construction and completion. The results demonstrate promising capabilities. Furthermore, we discuss the challenges and next steps for this research work. © 2023 CEUR-WS. All rights reserved.

Srividya voted
Ishan voted

#3828 - Khorashadizadeh 2024
Research Trends for the Interplay between Large Language Models and Knowledge Graphs

Khorashadizadeh, Hanieh; Amara, Fatima Zahra; Ezzabady, Morteza; Ieng, Frédéric; Tiwari, Sanju; Mihindukulasooriya, Nandana; Groppe, Jinghua; Sahri, Soror; Benamara, Farah; Groppe, Sven

arXiv 2024;():

2024

Ref ID: 8379

This survey investigates the synergistic relationship between Large Language Models (LLMs) and Knowledge Graphs (KGs), which is crucial for advancing AI's capabilities in understanding, reasoning, and language processing. It aims to address gaps in current research by exploring areas such as KG Question Answering, ontology generation, KG validation, and the enhancement of KG accuracy and consistency through LLMs. The paper further examines the roles of LLMs in generating descriptive texts and natural language queries for KGs. Through a structured analysis that includes categorizing LLM-KG interactions, examining methodologies, and investigating collaborative uses and potential biases, this study seeks to provide new insights into the combined potential of LLMs and KGs. It highlights the importance of their interaction for improving AI applications and outlines future research directions.

yuexi voted
Srividya voted

#308 - Kim 2022
Generative Model Using Knowledge Graph for Document-Grounded Conversations

Kim, B.; Lee, D.; Kim, D.; Kim, H.; Kim, S.; Kwon, O.

Appl. Sci.-Basel 2022;12(7):10

2022

DOI: 10.3390/app12073367 · Ref ID: 3084

Featured Application: Core technology for document-grounded conversation. Document-grounded conversation (DGC) is a natural language generation task to generate fluent and informative responses by leveraging dialogue history and document(s). Recently, DGCs have focused on fine-tuning using pretrained language models. However, these approaches have a problem in that they must leverage the background knowledge under capacity constraints. For example, the maximum length of the input is limited to 512 or 1024 tokens. This problem is fatal in DGC because most documents are longer than the maximum input length. To address this problem, we propose a document-grounded generative model using a knowledge graph. The proposed model converts knowledge sentences extracted from the given document(s) into knowledge graphs and fine-tunes the pretrained model using the graph. We validated the effectiveness of the proposed model using a comparative experiment on the well-known Wizard-of-Wikipedia dataset. The proposed model outperformed the previous state-of-the-art model in our experiments on the Doc2dial dataset.

Kwesi voted
Xinchen voted

#1339 - Kim 2024
Fusarium Protein Toolkit: a web-based resource for structural and variant analysis of Fusarium species

Kim, H. S.; Haley, O. C.; Portwood Ii, J. L.; Harding, S.; Proctor, R. H.; Woodhouse, M. R.; Sen, T. Z.; Andorf, C. M.

BMC Microbiol. 2024;24(1):

2024

DOI: 10.1186/s12866-024-03480-5 · Ref ID: 3824

Background: The genus Fusarium poses significant threats to food security and safety worldwide because numerous species of the fungus cause destructive diseases and/or mycotoxin contamination in crops. The adverse effects of climate change are exacerbating some existing threats and causing new problems. These challenges highlight the need for innovative solutions, including the development of advanced tools to identify targets for control strategies. Description: In response to these challenges, we developed the Fusarium Protein Toolkit (FPT), a web-based tool that allows users to interrogate the structural and variant landscape within the Fusarium pan-genome. The tool displays both AlphaFold and ESMFold-generated protein structure models from six Fusarium species. The structures are accessible through a user-friendly web portal and facilitate comparative analysis, functional annotation inference, and identification of related protein structures. Using a protein language model, FPT predicts the impact of over 270 million coding variants in two of the most agriculturally important species, Fusarium graminearum and F. verticillioides. To facilitate the assessment of naturally occurring genetic variation, FPT provides variant effect scores for proteins in a Fusarium pan-genome based on 22 diverse species. The scores indicate potential functional consequences of amino acid substitutions and are displayed as intuitive heatmaps using the PanEffect framework. Conclusion: FPT fills a knowledge gap by providing previously unavailable tools to assess structural and missense variation in proteins produced by Fusarium. FPT has the potential to deepen our understanding of pathogenic mechanisms in Fusarium, and aid the identification of genetic targets for control strategies that reduce crop diseases and mycotoxin contamination. Such targets are vital to solving the agricultural problems incited by Fusarium, particularly evolving threats resulting from climate change. 
Thus, FPT has the potential to contribute to improving food security and safety worldwide. © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2024.

Davis voted
Srividya voted

#2393 - Kim 2013
Entity Translation Mining from Comparable Corpora: Combining Graph Mapping with Corpus Latent Features

Kim, J.; Hwang, S. w.; Jiang, L.; Song, Y. I.; Zhou, M.

IEEE Transactions on Knowledge and Data Engineering 2013;25(8):1787-1800

2013

DOI: 10.1109/TKDE.2012.117 · Ref ID: 6460

This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Motivated by this observation, we propose a new holistic approach by 1) combining all similarity types used and 2) additionally considering relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from comparable corpora to extract relationship between named entities. Entity similarity and entity context similarity are then calculated from every pair of bilingual named entities. A reinforcing method is utilized to reflect relationship similarity and relationship context similarity between named entities. We also discover "latent" features lost in the graph extraction process and integrate this into our framework. According to our experimental results, our holistic graph-based approach and its enhancement using corpus latent features are highly effective and our framework significantly outperforms previous approaches.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1831 - Kim 2022
Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models

Kim, T.

Proceedings - International Conference on Computational Linguistics, COLING 2022;29():5398-5408

Association for Computational Linguistics (ACL) 2022

Ref ID: 5348

Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models. While attractive in that, like in-context learning, it requires no task-specific fine-tuning, the practical effectiveness of such an approach still remains unclear, except that it can function as a probe for investigating language models’ inner workings. In this work, we mathematically reformulate CPE-PLM and propose two advanced ensemble methods tailored for it, demonstrating that the new parsing paradigm can be competitive with common unsupervised parsers by introducing a set of heterogeneous PLMs combined using our techniques. Furthermore, we explore some scenarios where the trees generated by CPE-PLM are practically useful. Specifically, we show that CPE-PLM is more effective than typical supervised parsers in few-shot settings. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#163 - Kim 2021
Deep Learning-Based Knowledge Graph Generation for COVID-19

Kim, T.; Yun, Y.; Kim, N.

Sustainability 2021;13(4):19

2021

DOI: 10.3390/su13042276 · Ref ID: 2983

Many attempts have been made to construct new domain-specific knowledge graphs using the existing knowledge base of various domains. However, traditional "dictionary-based" or "supervised" knowledge graph building methods rely on predefined human-annotated resources of entities and their relationships. The cost of creating human-annotated resources is high in terms of both time and effort. This means that relying on human-annotated resources will not allow rapid adaptability in describing new knowledge when domain-specific information is added or updated very frequently, as in the recent coronavirus disease-19 (COVID-19) pandemic. Therefore, in this study, we propose an Open Information Extraction (OpenIE) system based on unsupervised learning without a pre-built dataset. The proposed method obtains knowledge from a vast amount of text documents about COVID-19 rather than from a general knowledge base and adds it to the existing knowledge graph. First, we constructed a COVID-19 entity dictionary, and then we scraped a large text dataset related to COVID-19. Next, we constructed a COVID-19 perspective language model by fine-tuning the bidirectional encoder representations from transformers (BERT) pre-trained language model. Finally, we defined a new COVID-19-specific knowledge base by extracting connecting words between COVID-19 entities using the BERT self-attention weights from COVID-19 sentences. Experimental results demonstrated that the proposed Co-BERT model outperforms the original BERT in terms of mask prediction accuracy and the metric for evaluation of translation with explicit ordering (METEOR) score.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#1329 - Kim 2024
Foundation Model for Biomedical Graphs: Integrating Knowledge Graphs and Protein Structures to Large Language Models

Kim, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;4():346-355

Association for Computational Linguistics (ACL) 2024

Ref ID: 4331

The Transformer model has become a de facto standard in natural language processing. Its adaptation to other fields, such as computer vision, showed promising results, suggesting that this architecture is a powerful neural network for representation learning regardless of the data type. This recent success has led to research on multimodal Large Language Models (LLMs), which has enabled new types of tasks and applications with multiple data types. However, multimodal LLMs in the biomedical domain are primarily limited to images, text, and/or sequence data. Here I propose to work on a multimodal LLM architecture for biomedical graphs such as protein structures and chemical molecules. The research hypothesis is based on the fact that clinicians and researchers in computational biology and clinical research take advantage of various information for their decision-making process. Therefore, an AI model able to handle multiple data types should boost its ability to use diverse knowledge for improved performance in clinical applications. ©2024 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#355 - Kirk 2024
Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis

Kirk, J. R.; Wray, R. E.; Lindes, P.; Laird, J. E.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18390-18398

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3747

Large language models (LLMs) offer significant promise as a knowledge source for task learning. Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks. We describe a cognitive-agent approach, STARS, that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences. The STARS approach is to increase the response space of LLMs and deploy general strategies, embedded within the autonomous agent, to evaluate, repair, and select among candidate responses produced by the LLM. We describe the approach and experiments that show how an agent, by retrieving and evaluating a breadth of responses from the LLM, can achieve 77-94% task completion in one-shot learning without user oversight. The approach achieves 100% task completion when human oversight (such as an indication of preference) is provided. Further, the type of oversight largely shifts from explicit, natural language instruction to simple confirmation/disconfirmation of high-quality responses that have been vetted by the agent before presentation to a user.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3410 - Kirk 2023
Exploiting Language Models as a Source of Knowledge for Cognitive Agents

Kirk, James R.; Wray, Robert E.; Laird, John E.

arXiv 2023;():

2023

Ref ID: 7883

Large language models (LLMs) provide capabilities far beyond sentence completion, including question answering, summarization, and natural-language inference. While many of these capabilities have potential application to cognitive systems, our research is exploiting language models as a source of task knowledge for cognitive agents, that is, agents realized via a cognitive architecture. We identify challenges and opportunities for using language models as an external knowledge source for cognitive systems and possible ways to improve the effectiveness of knowledge extraction by integrating extraction with cognitive architecture capabilities, highlighting with examples from our recent work in this area.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1981 - Knez 2024
Towards using Automatically Enhanced Knowledge Graphs to Aid Temporal Relation Extraction

Knez, T.; Žitnik, S.

1st Workshop on Patient-Oriented Language Processing, CL4Health 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;():131-136

European Language Resources Association (ELRA) 2024

Ref ID: 4650

Temporal relation extraction in medical document analysis is crucial for understanding patient histories and treatment outcomes. This paper introduces a novel approach leveraging a bimodal model integrating textual content and a knowledge graph to enhance temporal relation extraction. The paper presents ongoing research on constructing an optimal knowledge graph by augmenting PrimeKG with dynamically expanded information using a language model-generated knowledge graph. It also further personalizes the information with patient-specific graphs tailored for relation prediction. The pipeline for constructing this enriched knowledge graph is detailed, aiming to improve the capabilities of temporal relation extraction models. The preliminary results show that adding a simple knowledge graph to the temporal relation extraction model can significantly increase the performance, achieving new state-of-the-art results. While research on enhanced knowledge graphs is ongoing, this paper lays the groundwork for leveraging common knowledge to advance temporal relation extraction in medical contexts. This approach holds promise for enhancing the understanding of patient histories and treatment outcomes, potentially leading to improved healthcare decision-making and patient care. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3408 - Ko 2024
Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering

Ko, Sungho; Cho, Hyunjin; Chae, Hyungjoo; Yeo, Jinyoung; Lee, Dongha

arXiv 2024;():

2024

Ref ID: 8161

Recent studies have investigated utilizing Knowledge Graphs (KGs) to enhance Question Answering (QA) performance of Large Language Models (LLMs), yet structured KG verbalization remains challenging. Existing methods, such as triple-form or free-form textual conversion of triple-form facts, encounter several issues. These include reduced evidence density due to duplicated entities or relationships, and reduced evidence clarity due to an inability to emphasize crucial evidence. To address these issues, we propose EFSum, an Evidence-focused Fact Summarization framework for enhanced QA with knowledge-augmented LLMs. We optimize an open-source LLM as a fact summarizer through distillation and preference alignment. Our extensive experiments show that EFSum improves LLMs' zero-shot QA performance, and that it is possible to ensure both the helpfulness and faithfulness of the summary.
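The evidence-density issue the abstract describes is easy to see in a toy verbalizer. The sketch below (function names and example triples are illustrative, not taken from EFSum) contrasts naive triple-by-triple conversion, which repeats entities, with a subject-grouped variant that packs the same facts more densely:

```python
# Toy illustration of KG verbalization for LLM prompts. Naive conversion
# emits one sentence per triple and duplicates the subject entity;
# grouping facts by subject states each entity once.

from collections import defaultdict

def naive_verbalize(triples):
    """One clause per triple; entities are duplicated."""
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)

def grouped_verbalize(triples):
    """Group facts by subject to reduce entity duplication."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append(f"{p} {o}")
    return " ".join(f"{s}: {'; '.join(facts)}." for s, facts in by_subject.items())

triples = [
    ("Barack Obama", "born in", "Honolulu"),
    ("Barack Obama", "spouse", "Michelle Obama"),
    ("Barack Obama", "party", "Democratic Party"),
]
print(naive_verbalize(triples))    # subject repeated three times
print(grouped_verbalize(triples))  # subject stated once
```

EFSum goes further by training a summarizer to keep only question-relevant evidence; this sketch shows only the density half of the problem.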

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#444 - Koloski 2022
Knowledge graph informed fake news classification via heterogeneous representation ensembles

Koloski, B.; Perdih, T. S.; Robnik-Sikonja, M.; Pollak, S.; Skrlj, B.

Neurocomputing 2022;496():208-226

2022

DOI: 10.1016/j.neucom.2022.01.096 · Ref ID: 3053

Increasing amounts of freely available data, both in textual and relational form, offer the exploration of richer document representations, potentially improving model performance and robustness. An emerging problem in the modern era is fake news detection: many easily available pieces of information are not necessarily factually correct, and can lead to wrong conclusions or be used for manipulation. In this work we explore how different document representations, ranging from simple symbolic bag-of-words to contextual, neural language model-based ones, can be used for efficient fake news identification. One of the key contributions is a set of novel document representation learning methods based solely on knowledge graphs, i.e., extensive collections of (grounded) subject-predicate-object triplets. We demonstrate that knowledge graph-based representations already achieve performance competitive with conventionally accepted representation learners. Furthermore, when combined with existing contextual representations, knowledge graph-based document representations can achieve state-of-the-art performance. To our knowledge this is the first larger-scale evaluation of how knowledge graph-based representations can be systematically incorporated into the process of fake news classification. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3440 - Kommineni 2024
From human experts to machines: An LLM supported approach to ontology and knowledge graph construction

Kommineni, Vamsi Krishna; König-Ries, Birgitta; Samuel, Sheeba

arXiv 2024;():

2024

Ref ID: 8179

The conventional process of building Ontologies and Knowledge Graphs (KGs) heavily relies on human domain experts to define entities and relationship types, establish hierarchies, maintain relevance to the domain, fill the ABox (or populate with instances), and ensure data quality (including, among others, accuracy and completeness). On the other hand, Large Language Models (LLMs) have recently gained popularity for their ability to understand and generate human-like natural language, offering promising ways to automate aspects of this process. This work explores the (semi-)automatic construction of KGs facilitated by open-source LLMs. Our pipeline involves formulating competency questions (CQs), developing an ontology (TBox) based on these CQs, constructing KGs using the developed ontology, and evaluating the resultant KG with minimal to no involvement of human experts. We showcase the feasibility of our semi-automated pipeline by creating a KG on deep learning methodologies by exploiting scholarly publications. To evaluate the answers generated via Retrieval-Augmented-Generation (RAG) as well as the KG concepts automatically extracted using LLMs, we design a judge LLM, which rates the generated content based on ground truth. Our findings suggest that employing LLMs could potentially reduce the human effort involved in the construction of KGs, although a human-in-the-loop approach is recommended to evaluate automatically generated KGs.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1009 - Kong 2024
Automated Knowledge Mining and Knowledge Graph Reasoning for Aircraft Engine Maintenance

Kong, X.; Li, Y.; Fan, M.; Shi, J.; Wei, L.; Qu, S.

ACM International Conference Proceeding Series 2024;():35-40

Association for Computing Machinery 2024

DOI: 10.1145/3689218.3689221 · Ref ID: 3837

The maintenance process for aircraft engines is fraught with significant challenges due to their inherent complexity. Large Language Models excel in general Natural Language Processing tasks, yet they lack domain-specific knowledge, thereby compromising their performance in specialized areas. The varied descriptions of engine faults also render traditional text matching algorithms unsuitable for this maintenance domain. In this paper, we construct a knowledge graph integrated with fault diagnosis reasoning ability, with knowledge mined from aircraft engine maintenance data. Firstly, we propose the Knowledge Mining and Knowledge Graph Reasoning framework for aircraft engine maintenance data knowledge mining and aircraft engine fault diagnosis. Secondly, we utilize prompting with in-context learning to mitigate the issue of the model lacking expertise in the field of aircraft engine maintenance. Finally, we adopt a sentence similarity calculation method based on BERT, which enables more effective processing of semantic information. We apply our method to the Aircraft Engine Fault dataset, which is collected from maintenance records of civil aircraft engines from 2007 to 2015, and experimental results demonstrate the effectiveness of our knowledge mining method and aircraft engine fault reasoning algorithm. © 2024 ACM.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2017 - Koohborfardhaghighi 2024
Unlocking the Power of LLM-Based Question Answering Systems: Enhancing Reasoning, Insight, and Automation with Knowledge Graphs

Koohborfardhaghighi, S.; De Geyter, G.; Kaliner, E.

Lecture Notes in Networks and Systems 2024;1052 LNNS():156-171

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-3-031-64776-5_16 · Ref ID: 4409

In today’s data-driven business landscape, Knowledge Graphs can be effectively layered on top of relational databases and ontologies, a powerful combination for transforming how businesses tackle complex queries and decision-making processes. In this paper, we present a series of experiments that demonstrate the opportunities and advantages of blending knowledge graphs with Large Language Models (LLMs) through a practical use case. Our experimental results provide insights into the reasoning capabilities of LLMs when utilizing Knowledge Graph-Prompting. Furthermore, we observed the significance of maintaining uniformity in the language employed during knowledge graph construction to ensure precise responses from LLMs when querying the knowledge graph. This consistency also resonates in the embedding space of the model, where elements like relationship types are reflected in the resulting embeddings. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1096 - Korini 2023
Column Type Annotation using ChatGPT

Korini, K.; Bizer, C.

CEUR Workshop Proceedings 2023;3462():

CEUR-WS 2023

Ref ID: 5242

Column type annotation is the task of annotating the columns of a relational table with the semantic type of the values contained in each column. Column type annotation is an important pre-processing step for data search and data integration in the context of data lakes. State-of-the-art column type annotation methods either rely on matching table columns to properties of a knowledge graph or fine-tune pre-trained language models such as BERT for column type annotation. In this work, we take a different approach and explore using ChatGPT for column type annotation. We evaluate different prompt designs in zero- and few-shot settings and experiment with providing task definitions and detailed instructions to the model. We further implement a two-step table annotation pipeline which first determines the class of the entities described in the table and, depending on this class, asks ChatGPT to annotate columns using only the relevant subset of the overall vocabulary. Using instructions as well as the two-step pipeline, ChatGPT reaches F1 scores of over 85% in zero- and one-shot setups. To reach a similar F1 score, a RoBERTa model needs to be fine-tuned with 356 examples. This comparison shows that ChatGPT is able to deliver competitive results for the column type annotation task given no or only a minimal amount of task-specific demonstrations. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
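The zero-shot setup described here amounts to assembling a prompt from a column's values and the allowed type vocabulary, then asking the model for one label. A minimal sketch of that assembly step, with illustrative prompt wording not taken from the paper:

```python
# Hypothetical zero-shot prompt builder for column type annotation.
# The wording, function name, and example vocabulary are illustrative;
# the paper evaluates several prompt designs and a two-step pipeline.

def build_cta_prompt(column_values, vocabulary):
    """Assemble a zero-shot column type annotation prompt string."""
    values = ", ".join(column_values)
    types = ", ".join(vocabulary)
    return (
        "Task: annotate the semantic type of a table column.\n"
        f"Allowed types: {types}.\n"
        f"Column values: {values}.\n"
        "Answer with exactly one type from the allowed list."
    )

prompt = build_cta_prompt(["Berlin", "Paris", "Madrid"],
                          ["city", "country", "person"])
print(prompt)
```

In the paper's two-step variant, a first prompt would determine the table's entity class, and the vocabulary passed here would be restricted to types relevant for that class.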

Kwesi voted
brandon voted
Final decision
What was the agreed final decision?

#1898 - Kosten 2023
Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems

Kosten, C.; Cudre-Mauroux, P.; Stockinger, K.

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():5272-5281

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BigData59044.2023.10386182 · Ref ID: 4933

With the recent spike in the number and availability of Large Language Models (LLMs), it has become increasingly important to provide large and realistic benchmarks for evaluating Knowledge Graph Question Answering (KGQA) systems. So far the majority of benchmarks rely on pattern-based SPARQL query generation approaches. The subsequent natural language (NL) question generation is conducted through crowdsourcing or other automated methods, such as rule-based paraphrasing or NL question templates. Although some of these datasets are of considerable size, their pitfall lies in their pattern-based generation approaches, which do not always generalize well to the vague and linguistically diverse questions asked by humans in real-world contexts. In this paper, we introduce Spider4SPARQL, a new SPARQL benchmark dataset featuring 9,693 previously existing manually generated NL questions and 4,721 unique, novel, and complex SPARQL queries of varying complexity. In addition to the NL/SPARQL pairs, we also provide their corresponding 166 knowledge graphs and ontologies, which cover 138 different domains. Our complex benchmark enables novel ways of evaluating the strengths and weaknesses of modern KGQA systems. We evaluate Spider4SPARQL with state-of-the-art KGQA systems as well as LLMs, which achieve only up to 45% execution accuracy, demonstrating that Spider4SPARQL is a challenging benchmark for future research. © 2023 IEEE.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#108 - Koto 2022
Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian

Koto, F.; Baldwin, T.; Lau, J. H.; Assoc Computat, Linguist

1st Workshop on Commonsense Representation and Reasoning (CSRR) 2022;():8-16

Dublin, IRELAND Assoc Computational Linguistics-Acl 2022

Ref ID: 3707

Story comprehension that involves complex causal and temporal relations is a critical task in NLP, but previous studies have focused predominantly on English, leaving open the question of how the findings generalize to other languages, such as Indonesian. In this paper, we follow the Story Cloze Test framework of Mostafazadeh et al. (2016) in evaluating story understanding in Indonesian, by constructing a four-sentence story with one correct ending and one incorrect ending. To investigate commonsense knowledge acquisition in language models, we experimented with: (1) a classification task to predict the correct ending; and (2) a generation task to complete the story with a single sentence. We investigate these tasks in two settings: (i) monolingual training and (ii) zero-shot cross-lingual transfer between Indonesian and English.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1103 - Krishnan 2020
Common-knowledge concept recognition for SEVA

Krishnan, J.; Coronado, P.; Purohit, H.; Rangwala, H.

CEUR Workshop Proceedings 2020;2600():

CEUR-WS 2020

Ref ID: 5792

We build a common-knowledge concept recognition system for a Systems Engineer’s Virtual Assistant (SEVA) which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question-answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word-level by carefully defining a labelling scheme to train a sequence model to recognize systems engineering concepts. We use a pre-trained language model and fine-tune it with the labeled dataset of concepts. In addition, we also create some essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph using these extracted concepts along with some hyponym relations. Copyright © 2020 held by the author(s).
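The word-level labelling scheme described above is the standard setup for token classification; one common realization is BIO tagging. A minimal sketch of converting annotated spans into word-level labels, assuming a generic CONCEPT label rather than the authors' actual scheme:

```python
# Illustrative BIO tagging for concept recognition framed as token
# classification. The label name and example sentence are assumptions;
# the paper defines its own labelling scheme with a domain expert.

def bio_tags(tokens, concept_spans):
    """Convert [start, end) concept spans over a token list into BIO labels."""
    tags = ["O"] * len(tokens)
    for start, end in concept_spans:
        tags[start] = "B-CONCEPT"          # first token of the concept
        for i in range(start + 1, end):
            tags[i] = "I-CONCEPT"          # continuation tokens
    return tags

tokens = ["The", "flight", "software", "controls", "attitude"]
# One concept span covering "flight software" (tokens 1-2).
print(bio_tags(tokens, [(1, 3)]))
```

Word-level labels of this form are what a fine-tuned sequence model like the one described would be trained to predict.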

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3002 - Kroll 2021
A Toolbox for the Nearly-Unsupervised Construction of Digital Library Knowledge Graphs

Kroll, H.; Pirklbauer, J.; Balke, W. T.

2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2021;():21-30

2021

DOI: 10.1109/JCDL52503.2021.00014 · Ref ID: 6085

Knowledge graphs are essential for digital libraries to store entity-centric knowledge. The applications of knowledge graphs range from summarizing entity information over answering complex queries to inferring new knowledge. Yet, building knowledge graphs means either relying on manual curation or designing supervised extraction processes to harvest knowledge from unstructured text. Both approaches are obviously cost-intensive, so the question is whether we can minimize the effort needed to build a knowledge graph. And indeed, we propose a toolbox that provides methods to extract knowledge from arbitrary text. Our toolkit bypasses the need for supervision nearly completely and includes a novel algorithm to close the missing gaps. As a practical demonstration, we analyze our toolbox on established biomedical benchmarks. As far as we know, we are the first to propose, analyze, and share a nearly unsupervised and complete toolbox for building knowledge graphs from text.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1824 - Kruit 2024
Retrieval-based Question Answering with Passage Expansion using a Knowledge Graph

Kruit, B.; Xu, Y.; Kalo, J. C.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():14063-14072

European Language Resources Association (ELRA) 2024

Ref ID: 4600

Recent advancements in dense neural retrievers and language models have led to large improvements in state-of-the-art approaches to open-domain Question Answering (QA) based on retriever-reader architectures. However, issues stemming from data quality and imbalances in the use of dense embeddings have hindered performance, particularly for less common entities and facts. To tackle these problems, this study explores a multi-modal passage retrieval model's potential to bolster QA system performance. This study poses three key questions: (1) Can a distantly supervised question-relation extraction model enhance retrieval using a knowledge graph (KG), compensating for dense neural retrievers' shortcomings with rare entities? (2) How does this multi-modal approach compare to existing QA systems based on textual features? (3) Can this QA system alleviate poor performance on less common entities on common benchmarks? We devise a multi-modal retriever combining entity features and textual data, leading to improved retrieval precision in some situations, particularly for less common entities. Experiments across different datasets confirm enhanced performance for entity-centric questions, but challenges remain in handling complex generalized questions. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#2013 - Kuang 2023
Unleashing the Power of Language Models in Text-Attributed Graph

Kuang, H.; Xu, J.; Zhang, H.; Zhao, Z.; Zhang, Q.; Huang, X.; Wei, Z.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():8429-8441

Association for Computational Linguistics (ACL) 2023

Ref ID: 5076

Representation learning on graphs has been demonstrated to be a powerful tool for solving real-world problems. Among different types of graphs, text-attributed graphs carry both semantic and structural information. Existing works have paved the way for knowledge extraction from this type of data by leveraging language models, graph neural networks, or a combination of them. However, these works suffer from issues like underutilization of the relationships between nodes or words, or unaffordable memory cost. In this paper, we propose a Node Representation Update Pre-training Architecture based on Co-modeling Text and Graph (NRUP). In NRUP, we construct a hierarchical text-attributed graph that incorporates both initial nodes and word nodes. Meanwhile, we apply four self-supervised tasks to different levels of the constructed graph. We further design the pre-training framework to update the features of nodes during training epochs. We conduct experiments on the benchmark dataset ogbn-arxiv. Our method outperforms the baselines, fully demonstrating its validity and generalization. © 2023 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1370 - Kuang 2024
Harnessing multimodal large language models for traffic knowledge graph generation and decision-making

Kuang, S.; Liu, Y.; Wang, X.; Wu, X.; Wei, Y.

Commun. Transp. Res. 2024;4():

2024

DOI: 10.1016/j.commtr.2024.100146 · Ref ID: 3895

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#749 - Kuegler 2022
A Semantic Annotation Pipeline towards the Generation of Knowledge Graphs in Tribology

Kuegler, P.; Marian, M.; Dorsch, R.; Schleich, B.; Wartzack, S.

Lubricants 2022;10(2):25

2022

DOI: 10.3390/lubricants10020018 · Ref ID: 3243

Within the domain of tribology, enterprises and research institutions are constantly working on new concepts, materials, lubricants, or surface technologies for a wide range of applications. This is also reflected in the continuously growing number of publications, which in turn serve as guidance and benchmark for researchers and developers. Due to the lack of suitable data and knowledge bases, knowledge acquisition and aggregation is still a manual process involving the time-consuming review of literature. Therefore, semantic annotation and natural language processing (NLP) techniques can decrease this manual effort by providing semi-automatic support for knowledge acquisition. The generation of knowledge graphs as a structured information format from textual sources promises improved reuse and retrieval of information acquired from scientific literature. Motivated by this, the contribution introduces a novel semantic annotation pipeline for generating knowledge in the domain of tribology. The pipeline is built on Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art language model, and involves classic NLP tasks like information extraction, named entity recognition and question answering. Within this contribution, the three modules of the pipeline for document extraction, annotation, and analysis are introduced. Based on a comparison with a manual annotation of publications on tribological model testing, satisfactory performance is verified.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1088 - Kulkarni 2023
Cognitive Retrieve: Empowering Document Retrieval with Semantics and Domain-Specific Knowledge Graph

Kulkarni, A.; Ramanathan, C.; Venugopal, V. E.

CEUR Workshop Proceedings 2023;3532():

CEUR-WS 2023

Ref ID: 5196

As the data landscape continues to expand, the task of identifying relevant documents becomes increasingly complex, especially when dealing with diverse and varied data sources. Traditional keyword-based search systems struggle to capture the subtle contextual meaning of search queries. Semantic-based search, leveraging open data knowledge graphs, offers a solution by understanding contextual meaning. However, its effectiveness relies heavily on the quality and completeness of the underlying data used to define these semantics; incomplete data can lead to spurious results and a lack of relevance in the retrieved documents. To bridge this gap between user search interest and retrieval outcomes, we propose integrating domain-specific alignment into the search process. Our research aims to achieve this through the development of a semantic-driven data processing pipeline, laying the foundation for seamless semantic-oriented retrieval. This approach includes metadata extraction, considering domain-specific keywords and structural metadata from heterogeneous data sources. We enhance metadata by identifying latent terms using language models. Furthermore, we incorporate latent concepts and domain-specific information gathered from domain experts into a special knowledge graph construct, a ‘concept graph’. Our primary focus is on identifying relevant concepts from this graph, aligning with semantic and contextual aspects of the specified search intent. Our proposed document retrieval system, which combines the concept graph with semantics, is implemented using data from the Government of Karnataka, India. This approach addresses the administrative need to extract relevant documents from data silos, offering an alternative to traditional methods. Extensive evaluations demonstrate the proposed system’s superior performance in terms of true positive results compared to baseline systems like Lucene, Elasticsearch, and Doc2Vec.
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

mohammed afaan voted
yuexi voted

#3493 - Kulkarni 2024
HeCiX: Integrating Knowledge Graphs and Large Language Models for Biomedical Research

Kulkarni, Prerana Sanjay; Jain, Muskaan; Sheshanarayana, Disha; Parthiban, Srinivasan

arXiv 2024;():

2024

Ref ID: 8474

Despite advancements in drug development strategies, 90% of clinical trials fail. This suggests overlooked aspects in target validation and drug optimization. In order to address this, we introduce HeCiX-KG, Hetionet-Clinicaltrials neXus Knowledge Graph, a novel fusion of data from ClinicalTrials.gov and Hetionet in a single knowledge graph. HeCiX-KG combines data on previously conducted clinical trials from ClinicalTrials.gov, and domain expertise on diseases and genes from Hetionet. This offers a thorough resource for clinical researchers. Further, we introduce HeCiX, a system that uses LangChain to integrate HeCiX-KG with GPT-4, and increase its usability. HeCiX shows high performance during evaluation against a range of clinically relevant issues, proving this model to be promising for enhancing the effectiveness of clinical research. Thus, this approach provides a more holistic view of clinical trials and existing biological data.

mohammed afaan voted
Ishan voted

#3492 - Kulumba 2024
Harvesting Textual and Structured Data from the HAL Publication Repository

Kulumba, Francis; Antoun, Wissam; Vimont, Guillaume; Romary, Laurent

arXiv 2024;():

2024

Ref ID: 8496

HAL (Hyper Articles en Ligne) is the French national publication repository, used by most higher education and research organizations for their open science policy. As a digital library, it is a rich repository of scholarly documents, but its potential for advanced research has been underutilized. We present HALvest, a unique dataset that bridges the gap between citation networks and the full text of papers submitted on HAL. We craft our dataset by filtering HAL for scholarly publications, resulting in approximately 700,000 documents, spanning 34 languages across 13 identified domains, suitable for language model training, and yielding approximately 16.5 billion tokens (with 8 billion in French and 7 billion in English, the most represented languages). We transform the metadata of each paper into a citation network, producing a directed heterogeneous graph. This graph includes uniquely identified authors on HAL, as well as all open submitted papers, and their citations. We provide a baseline for authorship attribution using the dataset, implement a range of state-of-the-art models in graph representation learning for link prediction, and discuss the usefulness of our generated knowledge graph structure.

Kwesi voted
mohammed afaan voted

#403 - Kumar 2022
K-LM: Knowledge Augmenting in Language Models Within the Scholarly Domain

Kumar, V.; Recupero, D. R.; Helaoui, R.; Riboni, D.

IEEE Access 2022;10():91802-91815

2022

DOI: 10.1109/access.2022.3201542 · Ref ID: 3016

The use of superior algorithms and complex architectures in language models has successfully imparted human-like abilities to machines for specific tasks. But two significant constraints, the available training data size and the understanding of domain-specific context, prevent pre-trained language models from achieving optimal and reliable performance. A potential solution to tackle these limitations is to equip the language models with domain knowledge. While the commonly adopted techniques use Knowledge Graphs Embeddings (KGEs) to inject domain knowledge, we provide a Knowledge Language Model (K-LM) to use the Resource Description Framework (RDF) triples directly, extracted from world knowledge bases. The proposed model works in conjunction with Generative Pretrained Transformer (GPT-2) and Bidirectional Encoder Representations from Transformers (BERT) and uses a well-defined pipeline to select, categorize, and filter the RDF triples. In addition, we introduce heuristic methods to inject domain-specific knowledge in K-LM, leveraging knowledge graphs (KGs). We tested our approaches on the classification task within the scholarly domain using two KGs, and our results show that our proposed language model has significantly outperformed the baselines and BERT for each KG. Our experimental findings also underscore that the relevance of the KG used matters more than the quantity of injected RDF triples. Also, each of our proposed methods for injecting the RDF triples has increased the overall model's accuracy, demonstrating that K-LM is a potential choice for domain adaptation to solve knowledge-driven problems.

Ishan voted
Srividya voted

#558 - Kumichev 2024
MedSyn: LLM-Based Synthetic Medical Text Generation Framework

Kumichev, G.; Blinov, P.; Kuzkina, Y.; Goncharov, V.; Zubkova, G.; Zenovkin, N.; Goncharov, A.; Savchenko, A.

Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD) 2024;14950():215-230

Vilnius, LITHUANIA Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-70381-2_14 · Ref ID: 3384

Generating synthetic text addresses the challenge of data availability in privacy-sensitive domains such as healthcare. This study explores the applicability of synthetic data in real-world medical settings. We introduce MedSyn, a novel medical text generation framework that integrates large language models with a Medical Knowledge Graph (MKG). We use MKG to sample prior medical information for the prompt and generate synthetic clinical notes with GPT-4 and fine-tuned LLaMA models. We assess the benefit of synthetic data through application in the ICD code prediction task. Our research indicates that synthetic data can increase the classification accuracy of vital and challenging codes by up to 17.8% compared to settings without synthetic data. Furthermore, to provide new data for further research in the healthcare domain, we present the largest open-source synthetic dataset of clinical notes for the Russian language, comprising over 41k samples covering 219 ICD-10 codes.

Davis voted
mohammed afaan voted

#3031 - Kunze 2011
Towards semantic robot description languages

Kunze, L.; Roehm, T.; Beetz, M.

2011 IEEE International Conference on Robotics and Automation 2011;():5589-5595

2011

DOI: 10.1109/ICRA.2011.5980170 · Ref ID: 6827

There is a semantic gap between simple but high-level action instructions like “Pick up the cup with the right hand” and low-level robot descriptions that model, for example, the structure and kinematics of a robot's manipulator. Currently, programmers bridge this gap by mapping abstract instructions to parametrized algorithms and rigid body parts of a robot within their control programs. By linking descriptions of robot components, i.e. sensors, actuators and control programs, via capabilities to actions in an ontology we equip robots with knowledge about themselves that allows them to infer the required components for performing a given action. Thereby a robot that is instructed by an end-user, a programmer, or even another robot to perform a certain action, can assess itself whether it is able and how to perform the requested action. This self-knowledge for robots could considerably change the way of robot control, robot interaction, robot programming, and multi-robot communication.

mohammed afaan voted
yuexi voted

#1688 - Kurdiukov 2024
nlp_enjoyers at TextGraphs-17 Shared Task: Text-Graph Representations for Knowledge Graph Question Answering using all-MPNet

Kurdiukov, N.; Zinkovich, V.; Karpukhin, S.; Tikhomirov, P.

TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():126-130

Association for Computational Linguistics (ACL) 2024

Ref ID: 4234

This paper presents a model for solving the Multiple Choice Question Answering (MCQA) problem, focusing on the impact of subgraph extraction from a Knowledge Graph on model performance. The proposed method combines textual and graph information by adding linearized subgraphs directly into the main question prompt with separate tokens, enhancing the performance of models working with each modality separately. The study also includes an examination of Large Language Model (LLM) backbones and the benefits of linearized subgraphs and sequence length, with efficient training achieved through fine-tuning with LoRA. The top benchmark, using subgraphs and MPNet, achieved an F1 score of 0.3887. The main limitation of the experiments is the reliance on pre-generated subgraphs/triplets from the graph, and the lack of exploration of in-context learning and prompting strategies with decoder-based architectures. © 2024 Association for Computational Linguistics.
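As an illustrative sketch only (the helper names, the `[GRAPH]` marker, and the `[SEP]` separator are assumptions, not details taken from the paper), adding a linearized knowledge-graph subgraph directly into the question prompt with separate tokens might look like this:

```python
def linearize_subgraph(triples, sep="[SEP]"):
    """Flatten (head, relation, tail) triples into a single text span,
    delimited by a separator token the model can learn to recognize."""
    return f" {sep} ".join(f"{h} {r} {t}" for h, r, t in triples)


def build_prompt(question, triples):
    """Prepend the question, then append the linearized subgraph
    behind a dedicated marker token."""
    return f"{question} [GRAPH] {linearize_subgraph(triples)}"


prompt = build_prompt(
    "Which drug treats hypertension?",
    [("amlodipine", "treats", "hypertension"),
     ("amlodipine", "is_a", "calcium channel blocker")],
)
```

The marker and separator tokens let a single text encoder consume both modalities while still distinguishing the question from the graph evidence.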

yuexi voted
Mike voted

#1345 - Lageweg 2024
GECKO: A Question Answering System for Official Statistics

Lageweg, L.; Kouwenhoven, J.; Kruit, B.

CEUR Workshop Proceedings 2024;3759():

CEUR-WS 2024

Ref ID: 4246

This paper presents GECKO, a knowledge graph-based statistical question answering system currently in beta deployment. GECKO aims to facilitate the retrieval of single statistical values from an extensive database containing over a billion values across more than 4,000 tables. The system integrates a comprehensive framework including data augmentation, entity retrieval, and large language model (LLM)-based query generation. A key feature of the beta deployment is the collection of user feedback, which is critical for improving system performance and accuracy. This feedback mechanism allows users to report issues directly, ensuring continuous improvement based on real-world use. © 2024 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted

#307 - Lageweg 2024
Generative Expression Constrained Knowledge-Based Decoding for Open Data

Lageweg, L.; Kruit, B.

21st International Conference on The Semantic Web (ESWC) 2024;14664():307-325

Hersonissos, GREECE Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-60626-7_17 · Ref ID: 3304

In this paper, we present GECKO, a knowledge graph question answering (KGQA) system for data from Statistics Netherlands (Centraal Bureau voor de Statistiek). QA poses great challenges in terms of generating relevant answers, as well as preventing hallucinations. This is a phenomenon found in language models and creates issues when attempting factual QA with these models alone. To overcome these limitations, Statistics Netherlands' publicly available OData4 data was used to create a knowledge graph, in which the answer generation decoding process is grounded, ensuring faithful answers. When processing a question, GECKO performs entity and schema retrieval, does schema-constrained expression decoding, makes assumptions where needed and executes the generated expression as an OData4 query to retrieve information. A novel method was implemented to perform the constrained knowledge-based expression decoding using an encoder-decoder model. Both sparse and dense entity retrieval methods were evaluated. While the encoder-decoder model did not achieve production-ready performance, experiments show promising results for a rule-based baseline using a sparse entity retriever. Additionally, the results of qualitative user testing were positive. We therefore formulate recommendations for deployment to help guide users of Statistics Netherlands data to their answers more quickly.

mohammed afaan voted
yuexi voted

#2421 - Lai 2021
Extracting Semantics of Predicates From Millions of Bio-Medical Abstracts for Inferencing New Biological Key Events and Relationships

Lai, C.; Martinović-Weigelt, D.; Filippo, A. S. D.; Krämer, S.; Poschen, C.

2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021;():3484-3491

2021

DOI: 10.1109/BIBM52615.2021.9669549 · Ref ID: 6201

Adverse outcome pathways (AOP) structure toxicological knowledge as sequential, directed chains of key events (KE) that culminate in adverse outcomes (AOs). AOP development is a laborious process that involves extensive knowledge mining and could be improved via use of machine / deep learning. In this paper, we present an artificial intelligence system that can accelerate putative AOP development process by inferencing new AOP modules based on the knowledge learned from 16-million pre-parsed PubMed abstracts. Each AOP module is represented as a triplet that consists of antecedent and consequent biological entities connected by a relation. Users can also investigate specific types of antecedent, consequent, and relations by specifying macro/microtemplates using the MeSH semantic type hierarchy. We also provide visualizations to illustrate the hidden semantics that our system can extract from input triplets.

Ishan voted
brandon voted

#1792 - Laigle 2020
REEF: A framework for information extraction and automated knowledge graph construction

Laigle, J.; Collantes, C.; Cortis, A.; Jin, Z.; Bisset, A.

1st EAGE Digitalization Conference and Exhibition 2020;():

European Association of Geoscientists and Engineers, EAGE 2020

DOI: 10.3997/2214-4609.202032092 · Ref ID: 5787

We present Reef (Recursive Evidence Extraction Framework), a Python framework for automated information extraction from Petroleum Geoscience databases. Reef enables an end-to-end pipeline from raw documents to a Knowledge Graph. Reef makes possible two essential operations: 1/ discover entities in documents, characterize them and connect them to abstract concepts present in a knowledge graph and 2/ discover new knowledge with distant supervision. Knowledge graphs are key to building better search engines, question answering systems, recommendation engines, and feed algorithms for the cross-analysis of multiple datasets. Reef's unique approach leverages a comprehensive stack of open source and state-of-the-art libraries for documents digitalization and parsing, Natural Language Processing, Language Modeling, Logic Reasoning and Graph Analysis. These foundational components are seconded by custom applications for specific tasks. Documents processed in Reef are digitized and sent through a pipeline where their content is filtered according to a flexible, easily extensible, Petroleum Geoscience specific object model. Information can be extracted from text, tables, figures, and diagrams. Reef contains functions to infer information nature, digitize it, disambiguate and reconcile it into a graph database. Reef can be deployed in any cloud and delivers production ready knowledge graphs which can be served to third party applications. © EAGE 2019.

brandon voted
Kwesi voted

#3555 - Lairgi 2024
iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models

Lairgi, Yassir; Moncla, Ludovic; Cazabet, Rémy; Benabdeslem, Khalid; Cléau, Pierre

arXiv 2024;():

2024

Ref ID: 8579

Most available data is unstructured, making it challenging to access valuable information. Automatically building Knowledge Graphs (KGs) is crucial for structuring data and making it accessible, allowing users to search for information effectively. KGs also facilitate insights, inference, and reasoning. Traditional NLP methods, such as named entity recognition and relation extraction, are key in information retrieval but face limitations, including the use of predefined entity types and the need for supervised learning. Current research leverages large language models' capabilities, such as zero- or few-shot learning. However, unresolved and semantically duplicated entities and relations still pose challenges, leading to inconsistent graphs and requiring extensive post-processing. Additionally, most approaches are topic-dependent. In this paper, we propose iText2KG, a method for incremental, topic-independent KG construction without post-processing. This plug-and-play, zero-shot method is applicable across a wide range of KG construction scenarios and comprises four modules: Document Distiller, Incremental Entity Extractor, Incremental Relation Extractor, and Graph Integrator and Visualization. Our method demonstrates superior performance compared to baseline methods across three scenarios: converting scientific papers to graphs, websites to graphs, and CVs to graphs.

Davis voted
Mike voted

#1591 - Laleye 2023
Leveraging Knowledge Graph Embeddings to Enhance Contextual Representations for Relation Extraction

Laleye, F. A. A.; Rakotoson, L.; Massip, S.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14194 LNCS():19-31

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-41501-2_2 · Ref ID: 5286

Relation extraction task is a crucial and challenging aspect of Natural Language Processing. Several methods have surfaced as of late, exhibiting notable performance in addressing the task; however, most of these approaches rely on vast amounts of data from large-scale knowledge graphs or language models pretrained on voluminous corpora. In this paper, we hone in on the effective utilization of solely the knowledge supplied by a corpus to create a high-performing model. Our objective is to showcase that by leveraging the hierarchical structure and relational distribution of entities within a corpus without introducing external knowledge, a relation extraction model can achieve significantly enhanced performance. We therefore proposed a relation extraction approach based on the incorporation of pretrained knowledge graph embeddings at the corpus scale into the sentence-level contextual representation. We conducted a series of experiments which revealed promising and very interesting results for our proposed approach. The obtained results demonstrated an outperformance of our method compared to context-based relation extraction models. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Mike voted
Srividya voted

#1614 - Lan 2024
LLM4QA: Leveraging Large Language Model for Efficient Knowledge Graph Reasoning with SPARQL Query

Lan, M.; Xia, Y.; Zhou, G.; Huang, N.; Li, Z.; Wu, H.

J. Adv. Inf. Technol. 2024;15(10):1157-1162

2024

DOI: 10.12720/jait.15.10.1157-1162 · Ref ID: 4154

As one of the core technologies of general artificial intelligence, knowledge graph reasoning aims to infer new knowledge from existing knowledge in the knowledge base, providing decision support for knowledge-driven intelligent information services such as information retrieval, question answering, and recommendation systems. However, some issues, such as poor interpretability and low reasoning efficiency, still degrade current knowledge reasoning performance. To tackle these challenges, this paper proposes a knowledge graph reasoning method, LLM4QA, which leverages fine-tuned large language models with chain-of-thought to generate the graph query language SPARQL (i.e., SPARQL Protocol and RDF Query Language) for reasoning. Firstly, an efficient instruction fine-tuning method is applied to fine-tune open-source large language models with chain-of-thought. Then, the fine-tuned open-source large model is used to convert natural language questions into logical forms. Finally, we utilize unsupervised entity relationship retrieval to generate graph database query languages, realizing a natural language knowledge graph question-answering framework. Experimental results demonstrate that this method achieves strong performance in terms of inference accuracy and significantly improves model retrieval efficiency. © 2024 by the authors.

Xinchen voted
mohammed afaan voted

#157 - Lanchantin 2023
A Data Source for Reasoning Embodied Agents

Lanchantin, J.; Sukhbaatar, S.; Synnaeve, G.; Sun, Y. X.; Srinet, K.; Szlam, A.

37th AAAI Conference on Artificial Intelligence (AAAI) / 35th Conference on Innovative Applications of Artificial Intelligence / 13th Symposium on Educational Advances in Artificial Intelligence 2023;():8438-8446

Washington, DC Assoc Advancement Artificial Intelligence 2023

Ref ID: 3390

Recent progress in using machine learning models for reasoning tasks has been driven by novel model architectures, large-scale pre-training protocols, and dedicated reasoning datasets for fine-tuning. In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent. The generated data consists of templated text queries and answers, matched with world-states encoded into a database. The world-states are a result of both world dynamics and the actions of the agent. We show the results of several baseline models on instantiations of train sets. These include pre-trained language models fine-tuned on a text-formatted representation of the database, and graph-structured Transformers operating on a knowledge-graph representation of the database. We find that these models can answer some questions about the world-state, but struggle with others. These results hint at new research directions in designing neural reasoning models and database representations. Code to generate the data and train the models will be released at github.com/facebookresearch/neuralmemory.

mohammed afaan voted
Ishan voted

#2221 - Lau 2016
CASPR: A comprehensive cable-robot analysis and simulation platform for the research of cable-driven parallel robots

Lau, D.; Eden, J.; Tan, Y.; Oetomo, D.

2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016;():3004-3011

2016

DOI: 10.1109/IROS.2016.7759465 · Ref ID: 6973

The study of cable-driven parallel robots (CDPRs) has attracted much attention in recent years. However, to the best of the authors' knowledge, no single software platform exists for researchers to perform different types of analyses for CDPRs of arbitrary structure. In this paper, the Cable-robot Analysis and Simulation Platform for Research (CASPR) of CDPRs is introduced. Using this platform, arbitrary types and structures of CDPRs, such as single and multi-link CDPRs, can be studied for a wide range of analyses, including kinematics, dynamics, control and workspace analysis. CASPR achieves this using a general CDPR model representation and an abstracted software architecture. Moreover, CDPRs can be defined using Extensible Markup Language (XML) with out-of-the-box availability of an extensive range of robots and analysis tools. The open-source platform aims to provide a communal environment for researchers to use and to extend with new models and algorithms. The example case studies demonstrate the potential to perform analysis on CDPRs, directly compare algorithms and conveniently add new models and analyses.

mohammed afaan voted
yuexi voted

#19 - Lawley 2023
Applications of Natural Language Processing to Geoscience Text Data and Prospectivity Modeling

Lawley, C. J. M.; Gadd, M. G.; Parsa, M.; Lederer, G. W.; Graham, G. E.; Ford, A.

Nat. Resour. Res. 2023;32(4):1503-1527

2023

DOI: 10.1007/s11053-023-10216-1 · Ref ID: 3561

Geological maps are powerful models for visualizing the complex distribution of rock types through space and time. However, the descriptive information that forms the basis for a preferred map interpretation is typically stored in geological map databases as unstructured text data that are difficult to use in practice. Herein we apply natural language processing (NLP) to geoscientific text data from Canada, the U.S., and Australia to address that knowledge gap. First, rock descriptions, geological ages, lithostratigraphic and lithodemic information, and other long-form text data are translated to numerical vectors, i.e., a word embedding, using a geoscience language model. Network analysis of word associations, nearest neighbors, and principal component analysis are then used to extract meaningful semantic relationships between rock types. We further demonstrate using simple Naive Bayes classifiers and the area under receiver operating characteristics plots (AUC) how word vectors can be used to: (1) predict the locations of "pegmatitic" (AUC = 0.962) and "alkalic" (AUC = 0.938) rocks; (2) predict mineral potential for Mississippi-Valley-type (AUC = 0.868) and clastic-dominated (AUC = 0.809) Zn-Pb deposits; and (3) search geoscientific text data for analogues of the giant Mount Isa clastic-dominated Zn-Pb deposit using the cosine similarities between word vectors. This form of semantic search is a promising NLP approach for assessing mineral potential with limited training data. Overall, the results highlight how geoscience language models and NLP can be used to extract new knowledge from unstructured text data and reduce the mineral exploration search space for critical raw materials.
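The cosine-similarity semantic search described above can be sketched minimally as follows; the toy three-dimensional vectors and vocabulary here are illustrative assumptions standing in for embeddings from the geoscience language model:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy word vectors; real ones would come from the trained language model.
vectors = {
    "pegmatitic": [0.9, 0.1, 0.0],
    "alkalic":    [0.8, 0.3, 0.1],
    "sandstone":  [0.1, 0.9, 0.2],
}

# Rank the vocabulary by similarity to a query term's vector.
query = vectors["pegmatitic"]
ranked = sorted(vectors, key=lambda w: cosine(vectors[w], query), reverse=True)
```

Ranking every document or deposit description against a query vector in this way is what allows analogue search with essentially no labeled training data.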

mohammed afaan voted
yuexi voted

#311 - Lawley 2022
Geoscience language models and their intrinsic evaluation

Lawley, C. J. M.; Raimondo, S.; Chen, T. Y.; Brin, L.; Zakharov, A.; Kur, D.; Hui, J. Y.; Newton, G.; Burgoyne, S. L.; Marquis, G.

Appl. Comput. Geosci. 2022;14():10

2022

DOI: 10.1016/j.acags.2022.100084 · Ref ID: 3340

Geoscientists use observations and descriptions of the rock record to study the origins and history of our planet, which has resulted in a vast volume of scientific literature. Recent progress in natural language processing (NLP) has the potential to parse through and extract knowledge from unstructured text, but there has, so far, been only limited work on the concepts and vocabularies that are specific to geoscience. Herein we harvest and process public geoscientific reports (i.e., Canadian federal and provincial geological survey publications databases) and a subset of open access and peer-reviewed publications to train new, geoscience-specific language models to address that knowledge gap. Language model performance is validated using a series of new geoscience-specific NLP tasks (i.e., analogies, clustering, relatedness, and nearest neighbour analysis) that were developed as part of the current study. The raw and processed national geological survey corpora, language models, and evaluation criteria are all made public for the first time. We demonstrate that non-contextual (i.e., Global Vectors for Word Representation, GloVe) and contextual (i.e., Bidirectional Encoder Representations from Transformers, BERT) language models updated using the geoscientific corpora outperform the generic versions of these models for each of the evaluation criteria. Principal component analysis further demonstrates that word embeddings trained on geoscientific text capture meaningful semantic relationships, including rock classifications, mineral properties and compositions, and the geochemical behaviour of elements. Semantic relationships that emerge from the vector space have the potential to unlock latent knowledge within unstructured text, and perhaps more importantly, also highlight the potential for other downstream geoscience-focused NLP tasks (e.g., keyword prediction, document similarity, recommender systems, rock and mineral classification).

Mike voted
Srividya voted

#1363 - Le 2024
GraphLingo: Domain Knowledge Exploration by Synchronizing Knowledge Graphs and Large Language Models

Le, D.; Zhao, K.; Wang, M.; Wu, Y.

Proceedings - International Conference on Data Engineering 2024;():5477-5480

IEEE Computer Society 2024

DOI: 10.1109/ICDE60146.2024.00432 · Ref ID: 4422

Knowledge graphs (KGs) are routinely curated to provide factual data for various domain-specific analyses. Nevertheless, it remains nontrivial to explore domain knowledge with standard query languages. We demonstrate GraphLingo, a natural language (NL)-based knowledge exploration system designed for exploring domain-specific knowledge graphs. It differs from conventional knowledge graph search tools in that it enables an interactive exploratory NL query over domain-specific knowledge graphs. GraphLingo seamlessly integrates graph query processing and large language models with a graph pattern-based prompt generation approach to guide users in exploring relevant factual knowledge. It streamlines NL-based question & answer, graph query optimization & refining, and automatic prompt generation. A unique feature of GraphLingo is its capability to enable users to explore by seamlessly switching between a more 'open' approach and a more relevant yet 'conservative' one, facilitated by diversified query suggestions. We show cases of GraphLingo in curriculum suggestion, and materials scientific data search. © 2024 IEEE.

Mike voted
Xinchen voted

#3982 - Le-Duc 2024
wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech

Le-Duc, Khai; Dang, Quy-Anh; Pham, Tan-Hanh; Hy, Truong-Son

arXiv 2024;():

2024

Ref ID: 8519

Knowledge graphs (KGs) enhance the performance of large language models (LLMs) and search engines by providing structured, interconnected data that improves reasoning and context-awareness. However, KGs only focus on text data, thereby neglecting other modalities such as speech. In this work, we introduce wav2graph, the first framework for supervised learning of knowledge graphs from speech data. Our pipeline is straightforward: (1) constructing a KG based on transcribed spoken utterances and a named entity database, (2) converting the KG into embedding vectors, and (3) training graph neural networks (GNNs) for node classification and link prediction tasks. Through extensive experiments conducted in inductive and transductive learning contexts using state-of-the-art GNN models, we provide baseline results and error analysis for node classification and link prediction tasks on human transcripts and automatic speech recognition (ASR) transcripts, including evaluations using both encoder-based and decoder-based node embeddings, as well as monolingual and multilingual acoustic pre-trained models. All related code, data, and models are published online.

yuexi voted
Mike voted

#450 - Lee 2023
Knowledge Graph-based Genetic Fuzzy Agent for Human Intelligence and Machine Co-Learning

Lee, C. S.; Wang, M. H.; Chen, C. Y.; Reformat, M.; Nojima, Y.; Kubota, N.; Ieee

IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) 2023;():

Incheon, SOUTH KOREA Ieee 2023

DOI: 10.1109/fuzz52849.2023.10309699 · Ref ID: 3035

This paper proposes a novel approach for evaluating the co-learning performance of human intelligence (HI) and machine intelligence (MI) using a Knowledge Graph-based genetic fuzzy agent. The agent utilizes a Knowledge Graph structure to represent a specific knowledge domain related to human learning and employs a genetic fuzzy learning mechanism to construct a personalized learning model. Human learners can engage in co-learning with machines using state-of-the-art AI tools such as the Meta AI S2ST Taiwanese-English language model and the OpenAI ChatGPT text model. The proposed approach was evaluated using human learning data from an undergraduate computer science course and a series of Taiwanese and English language translation experience activities. The experimental results indicate that the proposed approach can effectively enhance the co-learning process for both human and machine learners.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1936 - Lee 2023
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning

Lee, D. H.; Ahrabian, K.; Jin, W.; Morstatter, F.; Pujara, J.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():544-557

Association for Computational Linguistics (ACL) 2023

Ref ID: 5014

Temporal knowledge graph (TKG) forecasting benchmarks challenge models to predict future facts using knowledge of past facts. In this paper, we develop an approach to use in-context learning (ICL) with large language models (LLMs) for TKG forecasting. Our extensive evaluation compares diverse baselines, including both simple heuristics and state-of-the-art (SOTA) supervised models, against pre-trained LLMs across several popular benchmarks and experimental settings. We observe that naive LLMs perform on par with SOTA models, which employ carefully designed architectures and supervised training for the forecasting task, falling within the (-3.6%, +1.5%) Hits@1 margin relative to the median performance. To better understand the strengths of LLMs for forecasting, we explore different approaches for selecting historical facts, constructing prompts, controlling information propagation, and parsing outputs into a probability distribution. A surprising finding from our experiments is that LLM performance endures (±0.4% Hits@1) even when semantic information is removed by mapping entities/relations to arbitrary numbers, suggesting that prior semantic knowledge is unnecessary; rather, LLMs can leverage the symbolic patterns in the context to achieve such a strong performance. Our analysis also reveals that ICL enables LLMs to learn irregular patterns from the historical context, going beyond frequency and recency biases. ©2023 Association for Computational Linguistics.
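The anonymization probe described above can be sketched as follows. The integer-mapping idea is from the abstract; the prompt template and function names are illustrative assumptions, not the paper's exact format.

```python
# Sketch of the semantic-removal probe: map entity/relation names to
# arbitrary stable integer ids before building the in-context prompt,
# so only symbolic patterns (not names) remain for the LLM.

def anonymize(facts):
    """Replace entity/relation strings with stable integer ids."""
    ids = {}
    def idx(name):
        return ids.setdefault(name, len(ids))
    return [(t, idx(s), idx(r), idx(o)) for (t, s, r, o) in facts]

def build_prompt(history, query_subject, query_relation):
    """Serialize historical quadruples, ending with an open query."""
    lines = [f"{t}: [{s}, {r}, {o}]" for (t, s, r, o) in history]
    lines.append(f"?: [{query_subject}, {query_relation},")
    return "\n".join(lines)

facts = [(0, "US", "consult", "UK"), (1, "US", "consult", "France")]
anon = anonymize(facts)
prompt = build_prompt(anon, 0, 1)
```

The model's completion of the open bracket is then parsed into a distribution over candidate objects.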

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#3267 - Lee 2024
Collaboratively adding new knowledge to an LLM

Lee, Rhui Dih; Wynter, Laura

arXiv 2024;():

2024

Ref ID: 8731

We address the question of how to successively add new knowledge to an LLM whilst retaining previously-added knowledge. We consider two settings, semi-cooperative and fully-cooperative. Overall, LoRA performs better in most cases than full-fine tuning of all parameters when both new knowledge acquisition and retention of old, including recent, knowledge are taken into account. In the semi-cooperative setting, where datasets are not available after training, MOE mixing, model merging, and LoRA-based orthogonal subspace sequential learning, using a small weight on the orthogonality term, perform well. In the fully-cooperative setting where datasets remain available, joint training and sequential training with replay are both effective approaches with LoRA training generally preferable to full fine-tuning. The codes needed to reproduce the results are provided in an open source repository.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2004 - Levy 2023
Understanding Natural Language in Context

Levy, A.; Karpas, E.

Proceedings International Conference on Automated Planning and Scheduling, ICAPS 2023;33():659-667

Association for the Advancement of Artificial Intelligence 2023

DOI: 10.1609/icaps.v33i1.27248 · Ref ID: 5225

Recent years have seen an increasing number of applications that have a natural language interface, either in the form of chatbots or via personal assistants such as Alexa (Amazon), Google Assistant, Siri (Apple), and Cortana (Microsoft). To use these applications, a basic dialog between the assistant and the human is required. While this kind of dialog exists today mainly within static robots that do not make any movement in the household space, the challenge of reasoning about the information conveyed by the environment increases significantly when dealing with robots that can move and manipulate objects in our home environment. In this paper, we focus on cognitive robots, which have some knowledge-based models of the world and operate by reasoning and planning with this model. Thus, when the robot and the human communicate, there is already some formalism they can use - the robot's knowledge representation formalism. In this paper we describe an approach for translating natural language directives into the robot's formalism, allowing much more complicated household tasks to be completed. We do so by combining off-the-shelf SoTA large language models, planning tools, and the robot knowledge of the state of the world and of its own model. This results in much more accurate interpretation of directives in natural language. Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#471 - Li 2019
Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation

Li, C. Y.; Liang, X. D.; Hu, Z. T.; Xing, E. P.; Aaai

33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence 2019;():6666-6673

Honolulu, HI Assoc Advancement Artificial Intelligence 2019

Ref ID: 3515

Generating long and semantically coherent reports to describe medical images poses great challenges towards bridging visual and linguistic modalities, incorporating medical domain knowledge, and generating realistic and accurate descriptions. We propose a novel Knowledge-driven Encode, Retrieve, Paraphrase (KERP) approach which reconciles traditional knowledge- and retrieval-based methods with modern learning-based methods for accurate and robust medical report generation. Specifically, KERP decomposes medical report generation into explicit medical abnormality graph learning and subsequent natural language modeling. KERP first employs an Encode module that transforms visual features into a structured abnormality graph by incorporating prior medical knowledge; then a Retrieve module that retrieves text templates based on the detected abnormalities; and lastly, a Paraphrase module that rewrites the templates according to specific cases. The core of KERP is a proposed generic implementation unit, the Graph Transformer (GTR), which dynamically transforms high-level semantics between graph-structured data of multiple domains such as knowledge graphs, images, and sequences. Experiments show that the proposed approach generates structured and robust reports supported with accurate abnormality descriptions and explainable attentive regions, achieving state-of-the-art results on two medical report benchmarks, with the best medical abnormality and disease classification accuracy and improved human evaluation performance.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3524 - Li 2024
Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems

Li, Chuang; Deng, Yang; Hu, Hengchang; Kan, Min-Yen; Li, Haizhou

arXiv 2024;():

2024

Ref ID: 8273

This paper aims to efficiently enable large language models (LLMs) to use external knowledge and goal guidance in conversational recommender system (CRS) tasks. Advanced LLMs (e.g., ChatGPT) are limited in domain-specific CRS tasks for 1) generating grounded responses with recommendation-oriented knowledge, or 2) proactively leading the conversations through different dialogue goals. In this work, we first analyze those limitations through a comprehensive evaluation, showing the necessity of external knowledge and goal guidance which contribute significantly to the recommendation accuracy and language quality. In light of this finding, we propose a novel ChatCRS framework to decompose the complex CRS task into several sub-tasks through the implementation of 1) a knowledge retrieval agent using a tool-augmented approach to reason over external Knowledge Bases and 2) a goal-planning agent for dialogue goal prediction. Experimental results on two multi-goal CRS datasets reveal that ChatCRS sets new state-of-the-art benchmarks, improving language quality of informativeness by 17% and proactivity by 27%, and achieving a tenfold enhancement in recommendation accuracy.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#831 - Li 2020
Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text

Li, D. F.; Hu, B. T.; Chen, Q. C.; Peng, W. H.; Wang, A. Q.; Assoc Computat, Linguist

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():1427-1438

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 3368

Machine reading comprehension (MRC) has achieved significant progress on the open domain in recent years, mainly due to large-scale pre-trained language models. However, it performs much worse in specific domains such as the medical field, due to the lack of extensive training data and the neglect of professional structural knowledge. As a first effort, we collect a large-scale medical multi-choice question dataset (more than 21k instances) for the National Licensed Pharmacist Examination in China. It is a challenging medical examination with a passing rate of less than 14.2% in 2018. We then propose a novel reading comprehension model, KMQA, which can fully exploit structural medical knowledge (i.e., a medical knowledge graph) and reference medical plain text (i.e., text snippets retrieved from reference books). The experimental results indicate that KMQA outperforms existing competitive models by a large margin and passes the exam with a 61.8% accuracy rate on the test set.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#394 - Li 2024
Joint extraction of Chinese medical entities and relations based on RoBERTa and single-module global pointer

Li, D. M.; Yang, Y.; Cui, J. M.; Meng, X. H.; Qu, J. T.; Jiang, Z. B.; Zhao, Y. F.

BMC Med. Inform. Decis. Mak. 2024;24(1):12

2024

DOI: 10.1186/s12911-024-02577-1 · Ref ID: 3563

Background: Most Chinese joint entity and relation extraction tasks in medicine involve numerous nested entities, overlapping relations, and other challenging extraction issues. In response to these problems, some traditional methods decompose the joint extraction task into multiple steps or multiple modules, introducing local dependencies in the process. Methods: To alleviate this issue, we propose a joint extraction model of Chinese medical entities and relations based on RoBERTa and a single-module global pointer, namely RSGP, which formulates joint extraction as a global pointer linking problem. Considering the uniqueness of Chinese language structure, we introduce the RoBERTa-wwm pre-trained language model at the encoding layer to obtain a better embedding representation. Then, we represent the input sentence as a third-order tensor and score each position in the tensor to prepare for the subsequent process of decoding the triples. In the end, we design a novel single-module global pointer decoding approach to alleviate the generation of redundant information. Specifically, we analyze the decoding process of single-character entities individually, improving the time and space performance of RSGP to some extent. Results: In order to verify the effectiveness of our model in extracting Chinese medical entities and relations, we carry out experiments on the public dataset CMeIE. Experimental results show that RSGP performs significantly better on the joint extraction of Chinese medical entities and relations, and achieves state-of-the-art results compared with baseline models. Conclusion: The proposed RSGP can effectively extract entities and relations from Chinese medical texts and help realize the structuring of Chinese medical texts, so as to provide high-quality data support for the construction of Chinese medical knowledge graphs.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1570 - Li 2023
Large Language Models with Controllable Working Memory

Li, D.; Rawat, A. S.; Zaheer, M.; Wang, X.; Lukasik, M.; Veit, A.; Yu, F.; Kumar, S.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():1774-1793

Association for Computational Linguistics (ACL) 2023

Ref ID: 5209

Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), partly owing to the massive amounts of world knowledge they memorize during pretraining. While many downstream applications provide the model with an informational context to aid its underlying task, how the model's world knowledge interacts with the factual information presented in the context remains underexplored. As a desirable behavior, an LLM should give precedence to the context whenever it contains task-relevant information that conflicts with the model's memorized knowledge. This enables model predictions to be grounded in the context, which then facilitates updating specific model predictions without frequently retraining the model. By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge. In this paper, we undertake a first joint study of these two properties, namely controllability and robustness, in the context of LLMs. We demonstrate that state-of-the-art T5 and PaLM models (both pretrained and finetuned) can exhibit low controllability and robustness, and that neither improves with increasing model size. As a solution, we propose a simple yet effective method - knowledge aware finetuning (KAFT) - to strengthen both controllability and robustness by injecting counterfactual and irrelevant contexts into standard supervised datasets. Our comprehensive evaluation showcases the utility of KAFT across model architectures and sizes. © 2023 Association for Computational Linguistics.
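The KAFT augmentation described above can be sketched as follows. This is a minimal reading of the idea, not the paper's actual procedure: the field layout, the string-replacement edit, and the function name are illustrative assumptions.

```python
# Sketch of KAFT-style augmentation: from one (question, context,
# answer) example, derive (a) a counterfactual-context copy whose
# target follows the edited context (controllability) and (b) an
# irrelevant-context copy whose target falls back to the original
# answer (robustness).

def kaft_augment(example, counterfactual_answer, irrelevant_context):
    q, ctx, ans = example
    counterfactual_ctx = ctx.replace(ans, counterfactual_answer)
    return [
        (q, ctx, ans),                                   # original
        (q, counterfactual_ctx, counterfactual_answer),  # follow context
        (q, irrelevant_context, ans),                    # ignore context
    ]

aug = kaft_augment(
    ("Capital of France?", "The capital of France is Paris.", "Paris"),
    "Lyon", "Bananas are yellow.")
```

Finetuning on the union of such triples is what pushes the model to prefer relevant context and ignore irrelevant context.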

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1465 - Li 2024
KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning

Li, D.; Zhang, T.; Huang, L.; Wang, C.; He, X.; Xue, H.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9693-9704

European Language Resources Association (ELRA) 2024

Ref ID: 4510

Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs) and integrate these external data sources into language models via self-supervised learning. Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration. In this paper, we propose to learn Knowledge-Enhanced language representations with Hierarchical Reinforcement Learning (KEHRL), which jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge. Specifically, a high-level reinforcement learning (RL) agent utilizes both internal and prior knowledge to iteratively detect essential positions in texts for knowledge injection, which filters out less meaningful entities to avoid diverting the knowledge learning direction. Once the entity positions are selected, a relevant triple filtration module is triggered to perform low-level RL to dynamically refine the triples associated with polysemic entities through binary-valued actions. Experiments validate KEHRL's effectiveness in probing factual knowledge and enhancing the model's performance on various natural language understanding tasks. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#586 - Li 2023
Multi-task Pre-training Language Model for Semantic Network Completion

Li, D.; Zhu, B. Q.; Yang, S.; Xu, K. L.; Yi, M.; He, Y. K.; Wang, H. M.

ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2023;22(11):19

2023

DOI: 10.1145/3627704 · Ref ID: 3105

Semantic networks, exemplified by the knowledge graph, serve as a means to represent knowledge by leveraging the structure of a graph. While the knowledge graph exhibits promising potential in the field of natural language processing, it suffers from incompleteness. This article focuses on the task of completing knowledge graphs by predicting linkages between entities, which is fundamental yet critical. Traditional methods based on translational distance struggle when dealing with unseen entities. In contrast, semantic matching presents itself as a potential solution due to its ability to handle such cases. However, semantic matching-based approaches necessitate large-scale datasets for effective training, which are typically unavailable in practical scenarios, hindering their competitive performance. To address this challenge, we propose a novel architecture for knowledge graphs known as LP-BERT, which incorporates a language model. LP-BERT consists of two primary stages: multi-task pre-training and knowledge graph fine-tuning. During the pre-training phase, the model acquires relationship information from triples by predicting either entities or relations through three distinct tasks. In the fine-tuning phase, we introduce a batch-based triple-style negative sampling technique inspired by contrastive learning. This method significantly increases the proportion of negative sampling while maintaining a nearly unchanged training time. Furthermore, we propose a novel data augmentation approach that leverages the inverse relationship of triples to enhance both the performance and robustness of the model. To demonstrate the effectiveness of our proposed framework, we conduct extensive experiments on three widely used knowledge graph datasets: WN18RR, FB15k-237, and UMLS. The experimental results showcase the superiority of our methods, with LP-BERT achieving state-of-the-art performance on the WN18RR and FB15k-237 datasets.
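The batch-based negative sampling described above can be sketched as an in-batch corruption scheme. The details below (tail-only corruption, the pairing rule, function name) are assumptions for illustration, not LP-BERT's exact procedure.

```python
# Sketch of batch-based triple-style negative sampling: within one
# batch, corrupt each positive (h, r, t) by pairing it with the tails
# of the other triples in the same batch, so negatives come "for free"
# without extra encoding work (an in-batch contrastive scheme).

def in_batch_negatives(batch):
    """For each (h, r, t), emit (h, r, t') for every other tail t'."""
    tails = [t for (_, _, t) in batch]
    negatives = []
    for i, (h, r, _) in enumerate(batch):
        for j, t in enumerate(tails):
            if j != i:
                negatives.append((h, r, t))
    return negatives

batch = [("a", "r1", "x"), ("b", "r2", "y"), ("c", "r3", "z")]
negs = in_batch_negatives(batch)
```

A batch of size B thus yields B*(B-1) negatives, which is how the proportion of negatives grows with nearly unchanged training time.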

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3202 - Li 2024
Automated Clinical Data Extraction with Knowledge Conditioned LLMs

Li, Diya; Kadav, Asim; Gao, Aijing; Li, Rui; Bourgon, Richard

arXiv 2024;():

2024

Ref ID: 8424

The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Our knowledge-conditioned approach also improves the accuracy and reliability of LLM outputs by addressing the extraction task in two stages: (i) lung lesion finding detection and primary structured field parsing, followed by (ii) further parsing of lesion description text into additional structured fields. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1449 - Li 2022
A Joint Extraction Strategy for Chinese Medical Text Based on Sequence Tagging

Li, J.; Ruan, D.

Proceedings - 2022 International Conference on Computer Engineering and Artificial Intelligence, ICCEAI 2022 2022;():6-10

Institute of Electrical and Electronics Engineers Inc. 2022

DOI: 10.1109/ICCEAI55464.2022.00011 · Ref ID: 5531

The research on entity and relation extraction in medical text is the basis of constructing medical knowledge graphs. Currently, the mainstream pipelined extraction methods do not consider the connection between entity recognition and relation classification, and cannot address the problem of overlapping relations among triplets. This paper proposes a joint extraction strategy for entities and relations in Chinese medical text based on sequence tagging, which splits the joint extraction task into two sequence tagging subtasks, namely HE and TRE, establishing the connection between subtasks through a shared encoding layer and the semantic information of the head entity. It incorporates the pre-trained language model RoBERTa to obtain richer numerical representations of word vectors, then fuses word vectors and part-of-speech vectors as the word-representation inputs for joint extraction, in combination with the GRU-BiLSTM model to extract entities and relations directly. Experimental results show that this model achieves a 54.44% F-value on the Chinese medical dataset CMeIE, outperforming the extraction performance of other pre-trained language models. © 2022 IEEE.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1512 - Li 2023
Knowledge graph representation learning model combining entity description and path information

Li, J.; Wu, Y.; Wang, H.; Li, Z.; Xu, J.

CAAI. Trans. Intell. Syst. 2023;18(1):153-161

2023

DOI: 10.11992/tis.202112010 · Ref ID: 5230

Knowledge graph representation learning is a process of representing knowledge graph entities and relations in a multidimensional vector through specific rules. Existing representation learning methods are mostly used to solve the single-hop knowledge graph question-and-answer task, but their multi-hop reasoning ability cannot meet the actual demand. To improve the multi-hop reasoning ability, a knowledge graph representation learning model combining entity description and path information is proposed. First, the learning vector of entity and relation representation is obtained using the pre-training language model RoBERTa. Second, OPTransE is used to transform the knowledge graph into a vector integrating the path information of an ordered relation. Finally, the total energy function is constructed to fuse the vectors of entity description and path information. The feasibility and validity of the model are verified by comparing its performance in a link prediction task with that of the mainstream knowledge graph representation learning model. © 2023, Editorial Department of CAAI Transactions on Intelligent Systems. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#277 - Li 2021
Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models

Li, J. Y.; Tang, T. Y.; Zhao, W. X.; Wei, Z. C.; Yuan, N. J.; Wen, J. R.

Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():1558-1568

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 3026

This paper studies how to automatically generate a natural language text that describes the facts in a knowledge graph (KG). Considering the few-shot setting, we leverage the excellent capacities of pretrained language models (PLMs) in language understanding and generation. We make three major technical contributions, namely representation alignment for bridging the semantic gap between KG encodings and PLMs, relation-biased KG linearization for deriving better input representations, and multi-task learning for learning the correspondence between KG and text. Extensive experiments on three benchmark datasets have demonstrated the effectiveness of our model on the KG-to-text generation task. In particular, our model outperforms all comparison methods in both fully-supervised and few-shot settings. Our code and datasets are available at https://github.com/RUCAIBox/Few-Shot-KG2Text.
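The KG linearization step can be sketched as follows. Grouping by relation is our reading of "relation-biased"; the special markers and function name are illustrative assumptions, not the paper's exact scheme.

```python
# Sketch of KG linearization for a PLM: serialize triples into one
# flat token sequence, grouping triples that share a relation so the
# ordering is biased by relation rather than arbitrary.

def linearize(triples):
    """Group triples by relation, then emit [HEAD] h [REL] r [TAIL] t."""
    by_rel = {}
    for h, r, t in triples:
        by_rel.setdefault(r, []).append((h, t))
    parts = []
    for r in sorted(by_rel):
        for h, t in by_rel[r]:
            parts.append(f"[HEAD] {h} [REL] {r} [TAIL] {t}")
    return " ".join(parts)

text = linearize([("Paris", "capital_of", "France"),
                  ("Paris", "located_in", "Europe")])
```

The resulting string is what the PLM consumes as its input representation before generating the describing sentence.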

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#1145 - Li 2024
COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion

Li, J.; Yu, H.; Luo, X.; Liu, Q.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():1669-1682

Association for Computational Linguistics (ACL) 2024

Ref ID: 4480

Knowledge graph completion (KGC) aims to infer missing facts based on existing facts within a KG. Recently, research on generative models (GMs) has addressed the limitations of embedding methods in terms of generality and scalability. However, GM-based methods are sensitive to the contextual facts in the KG, so contextual facts of poor quality can cause GMs to generate erroneous results. To improve the performance of GM-based methods on various KGC tasks, we propose a COntextual FactS GuIded GeneratioN (COSIGN) model. First, to enhance the inference ability of the generative model, we design a contextual facts collector to achieve human-like retrieval behavior. Second, a contextual facts organizer is proposed to learn the organizational capabilities of LLMs through knowledge distillation. Finally, the organized contextual facts serve as the input to the inference generator, which generates the missing facts. Experimental results demonstrate that COSIGN outperforms state-of-the-art baseline techniques in terms of performance. ©2024 Association for Computational Linguistics.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#3659 - Li 2024
LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning

Li, Jiachun; Cao, Pengfei; Wang, Chenhao; Jin, Zhuoran; Chen, Yubo; Liu, Kang; Jiang, Xiaojian; Xu, Jiexin; Zhao, Jun

arXiv 2024;():

2024

Ref ID: 8697

Large language models (LLMs) sometimes demonstrate poor performance on knowledge-intensive tasks, commonsense reasoning being one of them. Researchers typically address these issues by retrieving related knowledge from knowledge graphs or employing self-enhancement methods to elicit knowledge in LLMs. However, noisy knowledge and invalid reasoning hamper their ability to answer questions accurately. To this end, we propose a novel method named eliciting, filtering and integrating knowledge in large language models (LINKED). In it, we design a reward model to filter out noisy knowledge and a marginal consistent reasoning module to reduce invalid reasoning. In comprehensive experiments on two complex commonsense reasoning benchmarks, our method outperforms SOTA baselines (up to a 9.0% improvement in accuracy). Besides, to measure the positive and negative impact of the injected knowledge, we propose a new metric called the effectiveness-preservation score for knowledge-enhancement methods. Finally, through extensive experiments, we conduct an in-depth analysis and reach many meaningful conclusions about LLMs in commonsense reasoning tasks.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3577 - Li 2024
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning

Li, Jiaqi; Tang, Yixuan; Yang, Yi

arXiv 2024;():

2024

Ref ID: 8383

Large language models (LLMs) have demonstrated remarkable capabilities but still face challenges such as hallucinations. One potential cause of hallucinations is the lack of relevant knowledge or context. Thus, a promising solution involves instructing LLMs to respond with "I do not know" when a question falls outside their knowledge domain or the provided context. However, in this work, we observed that LLMs struggle to admit their lack of knowledge, primarily because existing instruction datasets are designed to encourage specific answers. To improve models' capability to recognize the boundaries of their knowledge, we propose a novel approach called uncertainty-sensitive tuning. This method involves a two-stage training procedure designed for uncertainty recognition and prompt-sensitive activation. In the first stage, we guide the LLM to reject unknown questions. In the second stage, we force the model to follow the instructions by incorporating designed causal instructions. The experimental results demonstrate that our proposed uncertainty-sensitive tuning method enhances the model's ability to identify areas of uncertainty. Specifically, it achieves a substantial improvement of up to 34.7% in handling questions involving knowledge gaps compared to the original model. Moreover, our finetuned models even outperform GPT-4, exhibiting an overall performance improvement of up to 4.2%.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1532 - Li 2022
A Knowledge-Enhanced Model with Dual-Channel Encoder for Joint Entity and Relation Extraction from Biomedical Literature

Li, L.; Jiang, S.; Zhang, B.

Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022 2022;():773-776

Institute of Electrical and Electronics Engineers Inc. 2022

DOI: 10.1109/BIBM55620.2022.9995158 · Ref ID: 5466

Biomedical entity and relation extraction has attracted increasing attention recently, whereas it remains challenging due to its domain-specific features for the biomedical corpus. Hence, many researchers consider utilizing external knowledge from large-scale databases to enhance the semantic understanding of models. However, these knowledge-enhanced methods usually enrich context information by incorporating the context-independent knowledge into entity representations and lack effective interaction. Actually, inspired by pre-trained language models, we argue that knowledge representations need to be trainable and adapted for different contexts. Therefore, we propose Knowledge-enhanced Dual-channel Iterative Model (KeDcIM), a novel end-to-end joint model for biomedical entity and relation extraction. Experiments show that KeDcIM achieves new state-of-the-art results on two benchmark datasets. © 2022 IEEE.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#1610 - Li 2024
LLM-based Multi-Level Knowledge Generation for Few-shot Knowledge Graph Completion

Li, Q.; Chen, Z.; Ji, C.; Jiang, S.; Li, J.

IJCAI International Joint Conference on Artificial Intelligence 2024;():2135-2143

International Joint Conferences on Artificial Intelligence 2024

Ref ID: 4384

Knowledge Graphs (KGs) are pivotal in various NLP applications but often grapple with incompleteness, especially due to the long-tail problem where infrequent, unpopular relationships drastically reduce the KG completion performance. In this paper, we focus on Few-shot Knowledge Graph Completion (FKGC), a task addressing these gaps in long-tail scenarios. Amidst the rapid evolution of Large Language Models, we propose a generation-based FKGC paradigm facilitated by LLM distillation. Our MuKDC framework employs multi-level knowledge distillation for few-shot KG completion, generating supplementary knowledge to mitigate data scarcity in few-shot environments. MuKDC comprises two primary components: Multi-level Knowledge Generation, which enriches the KG at various levels, and Consistency Assessment, to ensure the coherence and reliability of the generated knowledge. Most notably, our method achieves SOTA results in both FKGC and multi-modal FKGC benchmarks, significantly advancing KG completion and enhancing the understanding and application of LLMs in structured knowledge generation and assessment. © 2024 International Joint Conferences on Artificial Intelligence. All rights reserved.

Srividya voted
Xinchen voted

#249 - Li 2025
Explainable reasoning over temporal knowledge graphs by pre-trained language model

Li, Q.; Wu, G. Z.

Inf. Process. Manage. 2025;62(1):15

2025

DOI: 10.1016/j.ipm.2024.103903 · Ref ID: 3196

Temporal knowledge graph reasoning (TKGR) has been considered a crucial task for modeling evolving knowledge, aiming to infer the unknown connections between entities at specific times. Traditional TKGR methods try to aggregate structural information between entities and evolve representations of entities over distinct snapshots, while some other methods attempt to extract temporal logic rules from historical interactions. However, these methods fail to address the continuously emerging unseen entities over time and ignore the historical dependencies between entities and relations. To overcome these limitations, we propose a novel method, termed TPNet, which introduces a historical information completion strategy (HICS) and a pre-trained language model (PLM) to conduct explainable inductive reasoning over TKGs. Specifically, TPNet extracts reliable temporal logical paths from historical subgraphs using a temporal-correlated search strategy. For unseen entities, we utilize HICS to sample or generate paths to supplement their historical information. Besides, a PLM and a time-aware encoder are introduced to jointly encode the temporal paths, thereby comprehensively capturing dependencies between entities and relations. Moreover, the semantic similarity between the query quadruples and the extracted paths is evaluated to simultaneously optimize the representations of entities and relations. Extensive experiments on entity and relation prediction tasks are conducted to evaluate the performance of TPNet. The experimental results on four benchmark datasets demonstrate the superiority of TPNet over state-of-the-art TKGR methods, achieving MRR improvements of 14.35%, 23.08%, 6.75%, and 5.38% on the four datasets, respectively.

Ishan voted
Srividya voted

#1360 - Li 2023
Graph Reasoning for Question Answering with Triplet Retrieval

Li, S.; Gao, Y.; Jiang, H.; Yin, Q.; Li, Z.; Yan, X.; Zhang, C.; Yin, B.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():3366-3375

Association for Computational Linguistics (ACL) 2023

Ref ID: 5119

Answering complex questions often requires reasoning over knowledge graphs (KGs). State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into a KG encoder, e.g., a graph neural network (GNN), to model their local structures before being integrated into language models for question answering. However, this paradigm constrains retrieved knowledge to local subgraphs and discards more diverse triplets buried in KGs that are disconnected but useful for question answering. In this paper, we propose a simple yet effective method that first retrieves the most relevant triplets from KGs and then reranks them; the reranked triplets are concatenated with questions and fed into language models. Extensive results on both the CommonsenseQA and OpenbookQA datasets show that our method can outperform the state of the art by up to 4.6% absolute accuracy. © 2023 Association for Computational Linguistics.
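
The retrieve-then-concatenate pipeline this abstract describes can be sketched roughly as follows. The scoring function, helper names, and "vector per triplet" format are my assumptions for illustration, not the authors' implementation (which reranks with a trained model rather than raw similarity):

```python
from math import sqrt

def cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def retrieve_and_concat(question, q_vec, triplets, k=2):
    """Rank (triplet, vector) pairs by similarity to the question vector,
    keep the top k, and prepend their verbalized form to the question."""
    ranked = sorted(triplets, key=lambda t: cos(q_vec, t[1]), reverse=True)
    kept = [" ".join(t[0]) for t in ranked[:k]]
    return " . ".join(kept) + " . " + question
```

The resulting string would then be the input fed to the language model.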

Xinchen voted
Srividya voted

#1419 - Li 2022
Instilling Type Knowledge in Language Models via Multi-Task QA

Li, S.; Sridhar, M.; Prakash, C. S.; Cao, J.; Hamza, W.; McAuley, J.

Findings of the Association for Computational Linguistics: NAACL 2022 - Findings 2022;():594-603

Association for Computational Linguistics (ACL) 2022

Ref ID: 5589

Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge: their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges. © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.

Mike voted
Srividya voted

#449 - Li 2023
Knowledge Graph-Based Credibility Evaluation Method for Electric Grid Large Language Model Knowledge Question-Answering

Li, W. Q.; Qi, X. M.; Zhao, Q.; Wang, C.; Wu, Q. Y.; Tang, X. S.; Assoc Computing, Machinery

7th International Conference on Electronic Information Technology and Computer Engineering (EITCE) 2023;():754-759

Xiamen, PEOPLES R CHINA Assoc Computing Machinery 2023

DOI: 10.1145/3650400.3650526 · Ref ID: 2946

In the field of electricity, specialized terminology is often intricate and complex, making it challenging for non-experts to comprehend. However, with the advancement of artificial intelligence technology, the emergence of large language models provides a new technological solution to address this issue. Large language models, based on deep learning techniques, have the capability to quickly understand and interpret specialized terminology in the electricity domain through learning from a vast corpus of professional literature and data. They can then be applied to various domains, including question-answering systems. However, existing large language models still face issues of unreliable outputs, necessitating a method to evaluate their results and improve the quality of their applications. We propose a knowledge graph-based credibility evaluation method for electric grid large language model knowledge question-answering. This method aligns the answers generated by large language models with the knowledge graph of a local knowledge base and calculates their cosine similarity and Pearson correlation coefficient. We batch-process the answers from the large language model into an electricity dataset and validate them using this method. Experimental results demonstrate that this method can accurately and efficiently reflect the relevance between texts, providing a reliable scoring basis for question-answering by large models in vertical domains. Future research can focus on exploring other embedding methods that can better extract semantic relationships between texts and validating the feasibility of this method in vertical domains other than electricity.
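
The two scoring measures named in this abstract are standard. A minimal sketch over plain Python lists, assuming the embedding vectors for the LLM answer and the knowledge-graph text come from some upstream encoder not shown here:

```python
from math import sqrt

def cosine_similarity(x, y):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A credibility score for an answer could then combine both measures against the aligned knowledge-graph text; how the paper weights them is not stated in the abstract.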

Ishan voted
Srividya voted

#1982 - Li 2024
Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution

Li, X.; Cao, Y.; Pan, L.; Ma, Y.; Sun, A.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():493-516

Association for Computational Linguistics (ACL) 2024

Ref ID: 4387

Despite achieving great success, Large Language Models (LLMs) usually suffer from unreliable hallucinations. Although attribution can be a potential solution, there are no suitable benchmarks and evaluation metrics for attributing LLMs to structured knowledge. In this paper, we define a new task of Knowledge-aware Language Model Attribution (KaLMA) that improves upon three core concerns with conventional attributed LMs. First, we extend the attribution source from unstructured texts to Knowledge Graphs (KGs), whose rich structures benefit both the attribution performance and working scenarios. Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository, where the model identifies the need for supporting knowledge beyond the provided KG. Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text-citation alignment. To implement the above innovations, we build a dataset in the biography domain, BioKaLMA, via an evolutionary question generation strategy, to control the question complexity and the knowledge necessary for the answer. For evaluation, we develop a baseline solution and demonstrate the room for improvement in LLMs' citation generation, emphasizing the importance of incorporating the "Conscious Incompetence" setting and the critical role of retrieval accuracy. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted

#239 - Li 2024
Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation

Li, X.; Henriksson, A.; Duneld, M.; Nouri, J.; Wu, Y. C.

Future Internet 2024;16(1):21

2024

DOI: 10.3390/fi16010012 · Ref ID: 2968

Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner-relying only on readily available pre-trained models-facilitates the development of AI-enhanced learning.
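
The Recall@3 and MRR figures reported in this abstract are standard ranking metrics; a minimal sketch (data layout is an assumption: one ranked candidate list and one relevant item per query) shows how they are computed:

```python
def recall_at_k(ranked_lists, relevant, k=3):
    """Fraction of queries whose relevant item appears in the top k."""
    hits = sum(1 for ranked, rel in zip(ranked_lists, relevant) if rel in ranked[:k])
    return hits / len(ranked_lists)

def mrr(ranked_lists, relevant):
    """Mean reciprocal rank of the first relevant item (0 if absent)."""
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant):
        total += 1.0 / (ranked.index(rel) + 1) if rel in ranked else 0.0
    return total / len(ranked_lists)
```

For STS-based recommendation, the ranked lists would come from sorting textbook passages by embedding similarity to each exercise.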

Kwesi voted
Xinchen voted

#657 - Li 2024
PMET: Precise Model Editing in a Transformer

Li, X. P.; Li, S. S.; Song, S. Z.; Yang, J.; Ma, J.; Yu, J.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18564-18572

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3773

Model editing techniques modify a minor proportion of knowledge in Large Language Models (LLMs) at a relatively low cost and have demonstrated notable success. Existing methods assume Transformer Layer (TL) hidden states are values of key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use them to update the weights of the FFN in LLMs. However, the information flow of TL hidden states comes from three parts: Multi-Head Self-Attention (MHSA), the FFN, and residual connections. Existing methods neglect the fact that the TL hidden states contain information not specifically required for the FFN. Consequently, the performance of model editing decreases. To achieve more precise model editing, we analyze the hidden states of MHSA and the FFN, finding that MHSA encodes certain general knowledge extraction patterns. This implies that MHSA weights do not require updating when new knowledge is introduced. Based on the above findings, we introduce PMET, which simultaneously optimizes Transformer Component (TC, namely MHSA and FFN) hidden states, while only using the optimized TC hidden states of the FFN to precisely update FFN weights. Our experiments demonstrate that PMET exhibits state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that MHSA encodes certain general knowledge extraction patterns and indicating its storage of a small amount of factual knowledge. Our code is available at https://github.com/xpq-tech/PMET.

yuexi voted
Srividya voted

#1522 - Li 2024
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation

Li, X.; Song, L.; Jin, L.; Mi, H.; Ouyang, J.; Yu, D.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():666-676

European Language Resources Association (ELRA) 2024

Ref ID: 4541

Knowledge-based, open-domain dialogue generation aims to build chit-chat systems that talk to humans using mined support knowledge. Many types and sources of knowledge have previously been shown to be useful as support knowledge. Even in the era of large language models, response generation grounded in knowledge retrieved from additional up-to-date sources remains a practically important approach. While prior work using single-source knowledge has shown a clear positive correlation between the performances of knowledge selection and response generation, there are no existing multi-source datasets for evaluating support knowledge retrieval. Further, prior work has assumed that the knowledge sources available at test time are the same as during training. This unrealistic assumption unnecessarily handicaps models, as new knowledge sources can become available after a model is trained. In this paper, we present a high-quality benchmark named multi-source Wizard of Wikipedia (Ms.WoW) for evaluating multi-source dialogue knowledge selection and response generation. Unlike existing datasets, it contains clean support knowledge, grounded at the utterance level and partitioned into multiple knowledge sources. We further propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources in a zero-shot fashion. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

mohammed afaan voted
yuexi voted

#3130 - Li 2024
FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets

Li, Xiaohui Victor; Passino, Francesco Sanna

Proceedings of the 5th ACM International Conference on AI in Finance 2024;():573–581

Brooklyn, NY, USA Association for Computing Machinery 2024

DOI: 10.1145/3677052.3698603 · Ref ID: 7244

Mike voted
Srividya voted

#3898 - Li 2024
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

Li, Xiaopeng; Li, Shasha; Song, Shezheng; Liu, Huijun; Ji, Bin; Wang, Xi; Ma, Jun; Yu, Jie; Liu, Xiaodong; Wang, Jing; Zhang, Weimin

arXiv 2024;():

2024

Ref ID: 8058

The general capabilities of large language models (LLMs) make them the infrastructure for various AI applications, but updating their inner knowledge requires significant resources. Recent model editing is a promising technique for efficiently updating a small amount of knowledge in LLMs and has attracted much attention. In particular, local editing methods, which directly update model parameters, are more suitable for updating a small amount of knowledge. Local editing methods update weights by computing least-squares closed-form solutions and identify edited knowledge by vector-level matching at inference time, which achieves promising results. However, these methods still require a lot of time and resources to complete the computation. Moreover, vector-level matching lacks reliability, and such updates disrupt the original organization of the model's parameters. To address these issues, we propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching and adds them to the subject word embeddings in the Transformer input. To obtain these editing embeddings, we propose an optimizing-then-suppressing fusion method, which first optimizes learnable embedding vectors for the editing target and then suppresses the Knowledge Embedding Dimensions (KEDs) to obtain the final editing embeddings. We thus propose the SWEA⊕OS method for editing factual knowledge in LLMs. We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets. To further validate the reasoning ability of SWEA⊕OS in editing knowledge, we evaluate it on the more complex RippleEdits benchmark. The results demonstrate that SWEA⊕OS possesses SOTA reasoning ability.

Mike voted
Srividya voted

#3248 - Li 2024
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks

Li, Xingxuan; Xu, Weiwen; Zhao, Ruochen; Jiao, Fangkai; Joty, Shafiq; Bing, Lidong

arXiv 2024;():

2024

Ref ID: 8649

State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness. Existing methods harness the strengths of chain-of-thought and retrieval-augmented generation (RAG) to decompose a complex problem into simpler steps and apply retrieval to improve factual correctness. These methods work well on straightforward reasoning tasks but often falter on challenging tasks such as competitive programming and mathematics, due to frequent reasoning errors and irrelevant knowledge retrieval. To address this, we introduce Critic-guided planning with Retrieval-augmentation, CR-Planner, a novel framework that leverages fine-tuned critic models to guide both reasoning and retrieval processes through planning. CR-Planner solves a problem by iteratively selecting and executing sub-goals. Initially, it identifies the most promising sub-goal from reasoning, query generation, and retrieval, guided by rewards given by a critic model named sub-goal critic. It then executes this sub-goal through sampling and selecting the optimal output based on evaluations from another critic model named execution critic. This iterative process, informed by retrieved information and critic models, enables CR-Planner to effectively navigate the solution space towards the final answer. We employ Monte Carlo Tree Search to collect the data for training the critic models, allowing for a systematic exploration of action sequences and their long-term impacts. We validate CR-Planner on challenging domain-knowledge-intensive and reasoning-heavy tasks, including competitive programming, theorem-driven math reasoning, and complex domain retrieval problems. Our experiments demonstrate that CR-Planner significantly outperforms baselines, highlighting its effectiveness in addressing challenging problems by improving both reasoning and retrieval.

mohammed afaan voted
yuexi voted

#3624 - Li 2024
Large Language Model Agent for Fake News Detection

Li, Xinyi; Zhang, Yongfeng; Malthouse, Edward C.

arXiv 2024;():

2024

Ref ID: 8271

In the current digital era, the rapid spread of misinformation on online platforms presents significant challenges to societal well-being, public trust, and democratic processes, influencing critical decision making and public opinion. To address these challenges, there is a growing need for automated fake news detection mechanisms. Pre-trained large language models (LLMs) have demonstrated exceptional capabilities across various natural language processing (NLP) tasks, prompting exploration into their potential for verifying news claims. Instead of employing LLMs in a non-agentic way, where LLMs generate responses based on direct prompts in a single shot, our work introduces FactAgent, an agentic approach of utilizing LLMs for fake news detection. FactAgent enables LLMs to emulate human expert behavior in verifying news claims without any model training, following a structured workflow. This workflow breaks down the complex task of news veracity checking into multiple sub-steps, where LLMs complete simple tasks using their internal knowledge or external tools. At the final step of the workflow, LLMs integrate all findings throughout the workflow to determine the news claim's veracity. Compared to manual human verification, FactAgent offers enhanced efficiency. Experimental studies demonstrate the effectiveness of FactAgent in verifying claims without the need for any training process. Moreover, FactAgent provides transparent explanations at each step of the workflow and during final decision-making, offering insights into the reasoning process of fake news detection for end users. FactAgent is highly adaptable, allowing for straightforward updates to its tools that LLMs can leverage within the workflow, as well as updates to the workflow itself using domain knowledge. This adaptability enables FactAgent's application to news verification across various domains.
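
The final step of the workflow this abstract describes, integrating sub-step findings into a verdict, can be sketched as follows. The majority-vote integration, the function names, and the boolean tool interface are my illustrative assumptions; the paper's actual decision step relies on the LLM's own reasoning rather than a fixed rule:

```python
def fact_agent_verdict(claim, checks):
    """Run each checking tool on the claim and integrate the boolean
    findings (True = passes the check) into a final verdict.
    A toy stand-in for the workflow's final decision-making step."""
    findings = [check(claim) for check in checks]
    failed = findings.count(False)
    return "likely fake" if failed > len(findings) / 2 else "likely real"
```

Each `check` would correspond to one sub-step of the workflow, e.g., a source-credibility lookup or an internal-knowledge consistency probe.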

yuexi voted
mohammed afaan voted

#1543 - Li 2024
KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

Li, Y.; Huang, C.; Deng, S.; Lock, M. L.; Cao, T.; Oo, N.; Lim, H. W.; Hooi, B.

Proceedings of the 33rd USENIX Security Symposium 2024;():793-810

USENIX Association 2024

Ref ID: 4292

Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose a Large Language Model (LLM)-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines. © USENIX Security Symposium 2024. All rights reserved.

brandon voted
Kwesi voted

#473 - Li 2022
Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation

Li, Y.; Peng, B. L.; Shen, Y. L.; Mao, Y.; Liden, L.; Yu, Z.; Gao, J. F.; Assoc Computat, Linguist

Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAACL) - Human Language Technologies 2022;():206-218

Seattle, WA Assoc Computational Linguistics-Acl 2022

Ref ID: 3316

Knowledge-grounded dialogue systems are challenging to build due to the lack of training data and heterogeneous knowledge sources. Existing systems perform poorly on unseen topics due to limited topics covered in the training data. In addition, it is challenging to generalize to the domains that require different types of knowledge sources. To address the above challenges, we present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks. We first retrieve relevant information from heterogeneous knowledge sources (e.g., wiki, dictionary, or knowledge graph); then the retrieved knowledge is transformed into text and concatenated with dialogue history to feed into the language model for generating responses. PLUG is pre-trained on a large-scale knowledge-grounded dialogue corpus. The empirical evaluation on two benchmarks shows that PLUG generalizes well across different knowledge-grounded dialogue tasks. It achieves comparable performance with state-of-the-art methods in the fully-supervised setting and significantly outperforms other approaches in zero-shot and few-shot settings.

brandon voted
Kwesi voted

#77 - Li 2024
Building a knowledge graph to enrich ChatGPT responses in manufacturing service discovery

Li, Y. Q.; Starly, B.

J. Ind. Inf. Integr. 2024;40():15

2024

DOI: 10.1016/j.jii.2024.100612 · Ref ID: 3025

Sourcing and identification of new manufacturing partners is crucial for manufacturing system integrators to enhance agility and reduce risk through supply chain diversification in the global economy. The advent of advanced large language models has captured significant interest, due to their ability to generate comprehensive and articulate responses across a wide range of knowledge domains. However, the system often falls short in accuracy and completeness when responding to domain-specific inquiries, particularly in areas like manufacturing service discovery. This research explores the potential of leveraging Knowledge Graphs in conjunction with ChatGPT to streamline the process for prospective clients in identifying small manufacturing enterprises. In this study, we propose a method that integrates bottom-up ontology with advanced machine learning models to develop a Manufacturing Service Knowledge Graph from an array of structured and unstructured data sources, including the digital footprints of small-scale manufacturers throughout North America. The Knowledge Graph and the learned graph embedding vectors are leveraged to tackle intricate queries within the digital supply chain network, responding with enhanced reliability and greater interpretability. The approach highlighted is scalable to millions of entities that can be distributed to form a global Manufacturing Service Knowledge Network Graph that can potentially interconnect multiple types of Knowledge Graphs that span industry sectors, geopolitical boundaries, and business domains. The dataset developed for this study, now publicly accessible, encompasses more than 13,000 manufacturers' weblinks, manufacturing services, certifications, and location entity types.

mohammed afaan voted
yuexi voted

#1247 - Li 2024
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration

Li, Y.; Zhang, R.; Liu, J.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;15020 LNCS():251-265

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-3-031-72344-5_17 · Ref ID: 4139

While Large Language Models (LLMs) demonstrate exceptional performance in a multitude of Natural Language Processing (NLP) tasks, they encounter challenges in practical applications, including issues with hallucinations, inadequate knowledge updating, and limited transparency in the reasoning process. To overcome these limitations, this study innovatively proposes a collaborative training-free reasoning scheme involving tight cooperation between a Knowledge Graph (KG) and LLMs. This scheme first uses LLMs to iteratively explore the KG, selectively retrieving a task-relevant knowledge subgraph to support reasoning. The LLMs are then guided to further combine inherent implicit knowledge to reason on the subgraph while explicitly elucidating the reasoning process. Through such a cooperative approach, our scheme achieves more reliable knowledge-based reasoning and facilitates the tracing of the reasoning results. Experimental results show that our scheme achieves significant improvements across multiple datasets, notably an improvement of over 10% on the QALD10 dataset compared to both the best baseline and the fine-tuned state-of-the-art (SOTA) models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

mohammed afaan voted
yuexi voted

#814 - Li 2024
Text-enhanced knowledge graph representation learning with local structure

Li, Z. F.; Jian, Y.; Xue, Z. C.; Zheng, Y. M.; Zhang, M.; Zhang, Y.; Hou, X. J.; Wang, X. G.

Inf. Process. Manage. 2024;61(5):16

2024

DOI: 10.1016/j.ipm.2024.103797 · Ref ID: 3005

Knowledge graph representation learning entails transforming entities and relationships within a knowledge graph into vectors to enhance downstream tasks. The rise of pre-trained language models has recently promoted text-based approaches for knowledge graph representation learning. However, these methods often need more structural information on knowledge graphs, prompting the challenge of integrating graph structure knowledge into text-based methodologies. To tackle this issue, we introduce a text-enhanced model with local structure (TEGS) that embeds local graph structure details from the knowledge graph into the text encoder. TEGS integrates k-hop neighbor entity information into the text encoder and employs a decoupled attention mechanism to blend relative position encoding and text semantics. This strategy augments learnable content through graph structure information and mitigates the impact of semantic ambiguity via the decoupled attention mechanism. Experimental findings demonstrate TEGS's effectiveness at fusing graph structure information, resulting in state-of-the-art performance across three datasets in link prediction tasks. In terms of Hit@1, when compared to the previous text-based models, our model demonstrated improvements of 2.1% on WN18RR, 2.4% on FB15k-237, and 2.7% on the NELL-One dataset. Our code is made publicly available on
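
Collecting the k-hop neighbor entities that this abstract feeds into the text encoder is a standard graph traversal; a minimal BFS sketch (the adjacency-dict data layout is an assumption for illustration, the paper's actual pipeline is not shown here):

```python
from collections import deque

def k_hop_neighbors(adj, start, k):
    """Entities reachable from `start` within k hops, via breadth-first
    search over an adjacency dict mapping entity -> list of neighbors."""
    seen = {start}
    frontier = deque([(start, 0)])
    result = set()
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nb in adj.get(node, []):
            if nb not in seen:
                seen.add(nb)
                result.add(nb)
                frontier.append((nb, depth + 1))
    return result
```

The returned entity set would then be verbalized and appended to the entity's textual description before encoding.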

Mike voted
Davis voted

#583 - Li 2023
Multi-Hop Question Generation with Knowledge Graph-Enhanced Language Model

Li, Z. P.; Cao, Z.; Li, P. F.; Zhong, Y.; Li, S. B.

Appl. Sci.-Basel 2023;13(9):15

2023

DOI: 10.3390/app13095765 · Ref ID: 2962

The task of multi-hop question generation (QG) seeks to generate questions that require a complex reasoning process that spans multiple sentences and answers. Beyond the conventional challenges of what to ask and how to ask, multi-hop QG necessitates sophisticated reasoning from dispersed evidence across multiple sentences. To address these challenges, a knowledge graph-enhanced language model (KGEL) has been developed to imitate human reasoning for multi-hop questions. The initial step in KGEL involves encoding the input sentence with a pre-trained GPT-2 language model to obtain a comprehensive semantic context representation. Next, a knowledge graph is constructed using the entities identified within the context. The critical information in the graph that is related to the answer is then utilized to update the context representations through an answer-aware graph attention network (GAT). Finally, the multi-head attention generation module (MHAG) is performed over the updated latent representations of the context to generate coherent questions. Human evaluations demonstrate that KGEL generates more logical and fluent multi-hop questions compared to GPT-2. Furthermore, KGEL outperforms five prominent baselines in automatic evaluations, with a BLEU-4 score that is 27% higher than that of GPT-2.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#229 - Li 2024
Ensemble pretrained language models to extract biomedical knowledge from literature

Li, Z.; Wei, Q.; Huang, L. C.; Li, J. F.; Hu, Y.; Chuang, Y. S.; He, J. P.; Das, A.; Keloth, V. K.; Yang, Y. T.; Diala, C. S.; Roberts, K. E.; Tao, C.; Jiang, X. Q.; Zheng, W. J.; Xu, H.

J. Am. Med. Inf. Assoc. 2024;31(9):1904-1911

2024

DOI: 10.1093/jamia/ocae061 · Ref ID: 3505

Objectives: The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases and highlight research deficiencies. The LitCoin Natural Language Processing (NLP) challenge, organized by the National Center for Advancing Translational Science, aims to evaluate such potential and provides a manually annotated corpus for methodology development and benchmarking. Materials and Methods: For the named entity recognition (NER) task, we utilized ensemble learning to merge predictions from three domain-specific models, namely BioBERT, PubMedBERT, and BioM-ELECTRA, devised a rule-driven detection method for cell line and taxonomy names, and annotated 70 more abstracts as an additional corpus. We further fine-tuned the T0pp model, with 11 billion parameters, to boost performance on relation extraction, and leveraged entities' location information (e.g., title, background) to enhance novelty prediction performance in relation extraction (RE). Results: Our pioneering NLP system designed for this challenge secured first place in Phase I (NER) and second place in Phase II (relation extraction and novelty prediction), outpacing over 200 teams. We tested OpenAI ChatGPT 3.5 and ChatGPT 4 in a zero-shot setting using the same test set, revealing that our fine-tuned model considerably surpasses these broad-spectrum large language models. Discussion and Conclusion: Our outcomes depict a robust NLP system excelling in NER and RE across various biomedical entities, emphasizing that task-specific models remain superior to generic large ones. Such insights are valuable for endeavors like knowledge graph development and hypothesis formulation in biomedical research.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1585 - Liang 2023
Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Liang, C.; Zuo, S.; Zhang, Q.; He, P.; Chen, W.; Zhao, T.

Proceedings of Machine Learning Research 2023;202():20852-20867

ML Research Press 2023

Ref ID: 5221

Layer-wise distillation is a powerful tool to compress large models (i.e. teacher models) into small ones (i.e., student models). The student distills knowledge from the teacher by mimicking the hidden representations of the teacher at every intermediate layer. However, layer-wise distillation is difficult. Since the student has a smaller model capacity than the teacher, it is often under-fitted. Furthermore, the hidden representations of the teacher contain redundant information that the student does not necessarily need for the target task's learning. To address these challenges, we propose a novel Task-aware layEr-wise Distillation (TED). TED designs task-aware filters to align the hidden representations of the student and the teacher at each layer. The filters select the knowledge that is useful for the target task from the hidden representations. As such, TED reduces the knowledge gap between the two models and helps the student to fit better on the target task. We evaluate TED in two scenarios: continual pre-training and fine-tuning. TED demonstrates significant and consistent improvements over existing distillation methods in both scenarios. Code is available at https://github.com/cliang1453/task-aware-distillation. © 2023 Proceedings of Machine Learning Research. All rights reserved.
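TED's task-aware filters can be pictured as learned projections applied to both models' hidden states before the usual layer-wise matching loss, so that only task-relevant directions are matched. A toy numpy sketch under that reading (all shapes, weight matrices, and names here are hypothetical illustrations, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

d_teacher, d_student, d_filter, seq_len = 8, 4, 3, 5

# Hidden states at one aligned layer (hypothetical shapes).
h_teacher = rng.normal(size=(seq_len, d_teacher))
h_student = rng.normal(size=(seq_len, d_student))

# Task-aware filters: learned linear maps projecting both representations
# into a shared low-dimensional space; in TED these are trained so that
# task-irrelevant directions are discarded before matching.
W_t = rng.normal(size=(d_teacher, d_filter))
W_s = rng.normal(size=(d_student, d_filter))

def ted_layer_loss(h_s, h_t, W_s, W_t):
    """Mean-squared error between filtered student and teacher states."""
    return float(np.mean((h_s @ W_s - h_t @ W_t) ** 2))

loss = ted_layer_loss(h_student, h_teacher, W_s, W_t)
```

In training, this per-layer loss would be summed over the aligned layers and added to the task loss; the sketch omits how the filters themselves are learned.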

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#1374 - Liang 2023
Hi-ArG: Exploring the Integration of Hierarchical Argumentation Graphs in Language Pretraining

Liang, J.; Ye, R.; Han, M.; Zhang, Q.; Lai, R.; Zhang, X.; Cao, Z.; Huang, X.; Wei, Z.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():14606-14620

Association for Computational Linguistics (ACL) 2023

Ref ID: 4955

The knowledge graph is a structure to store and represent knowledge, and recent studies have discussed its capability to assist language models for various applications. Some variations of knowledge graphs aim to record arguments and their relations for computational argumentation tasks. However, many of them must simplify semantic types to fit specific schemas, thus losing flexibility and expressive ability. In this paper, we propose the Hierarchical Argumentation Graph (Hi-ArG), a new structure to organize arguments. We also introduce two approaches to exploit Hi-ArG, including a text-graph multi-modal model GreaseArG and a new pre-training framework augmented with graph information. Experiments on two argumentation tasks have shown that after further pre-training and fine-tuning, GreaseArG supersedes same-scale language models on these tasks, while incorporating graph information during further pre-training can also improve the performance of vanilla language models. Code for this paper is available at https://github.com/ljcleo/Hi-ArG. © 2023 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3562 - Liang 2024
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation

Liang, Lei; Sun, Mengshu; Gui, Zhengke; Zhu, Zhongshu; Jiang, Zhouyu; Zhong, Ling; Qu, Yuan; Zhao, Peilong; Bo, Zhongpu; Yang, Jin; Xiong, Huaidong; Yuan, Lin; Xu, Jun; Wang, Zaoyang; Zhang, Zhiqiang; Zhang, Wen; Chen, Huajun; Chen, Wenguang; Zhou, Jun

arXiv 2024;():

2024

Ref ID: 8614

The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications. However, it also has limitations, including the gap between vector similarity and the relevance of knowledge reasoning, as well as insensitivity to knowledge logic, such as numerical values, temporal relations, expert rules, and others, which hinder the effectiveness of professional knowledge services. In this work, we introduce a professional domain knowledge service framework called Knowledge Augmented Generation (KAG). KAG is designed to address the aforementioned challenges with the motivation of making full use of the advantages of knowledge graphs (KGs) and vector retrieval, and to improve generation and reasoning performance by bidirectionally enhancing large language models (LLMs) and KGs through five key aspects: (1) LLM-friendly knowledge representation, (2) mutual indexing between knowledge graphs and original chunks, (3) a logical-form-guided hybrid reasoning engine, (4) knowledge alignment with semantic reasoning, and (5) model capability enhancement for KAG. We compared KAG with existing RAG methods in multi-hop question answering and found that it significantly outperforms state-of-the-art methods, achieving a relative improvement of 19.6% on 2wiki and 33.5% on hotpotQA in terms of F1 score. We have successfully applied KAG to two professional knowledge Q&A tasks of Ant Group, including E-Government Q&A and E-Health Q&A, achieving significant improvement in professionalism compared to RAG methods.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3240 - Liang 2023
C5: Towards Better Conversation Comprehension and Contextual Continuity for ChatGPT

Liang, Pan; Ye, Danwei; Zhu, Zihao; Wang, Yunchao; Xia, Wang; Liang, Ronghua; Sun, Guodao

arXiv 2023;():

2023

Ref ID: 7799

Large language models (LLMs), such as ChatGPT, have demonstrated outstanding performance in various fields, particularly in natural language understanding and generation tasks. In complex application scenarios, users tend to engage in multi-turn conversations with ChatGPT to keep contextual information and obtain comprehensive responses. However, human forgetting and model contextual forgetting remain prominent issues in multi-turn conversation scenarios, which challenge the users' conversation comprehension and contextual continuity for ChatGPT. To address these challenges, we propose an interactive conversation visualization system called C5, which includes a Global View, a Topic View, and a Context-associated Q&A View. The Global View uses the GitLog diagram metaphor to represent the conversation structure, presenting the trend of conversation evolution and supporting the exploration of locally salient features. The Topic View is designed to display all the question and answer nodes and their relationships within a topic using the structure of a knowledge graph, thereby displaying the relevance and evolution of conversations. The Context-associated Q&A View consists of three linked views, which allow users to explore individual conversations deeply while providing specific contextual information when posing questions. The usefulness and effectiveness of C5 were evaluated through a case study and a user study.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1581 - Liang 2024
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation

Liang, Y.; Song, Z.; Wang, H.; Zhang, J.

KnowledgeNLP 2024 - 3rd Workshop on Knowledge Augmented Methods for NLP, Proceedings of the Workshop 2024;():44-58

Association for Computational Linguistics (ACL) 2024

Ref ID: 4283

We evaluate the ability of Large Language Models (LLMs) to discern and express their internal knowledge state, a key factor in countering factual hallucination and ensuring reliable application of LLMs. We observe a robust self-awareness of internal knowledge state in LLMs, evidenced by over 85% accuracy in knowledge state probing. However, LLMs often fail to faithfully express their internal knowledge during generation, leading to factual hallucinations. We develop an automated hallucination annotation tool, DreamCatcher, which merges knowledge probing and consistency checking methods to rank factual preference data. Using knowledge preference as the reward, we propose a Reinforcement Learning from Knowledge Feedback (RLKF) training framework, leveraging reinforcement learning to enhance the factuality and honesty of LLMs. Our experiments across multiple models show that RLKF training effectively enhances the ability of models to utilize their internal knowledge state, boosting performance in a variety of knowledge-based and honesty-related tasks. © 2024 Association for Computational Linguistics.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#20 - Ligabue 2024
Applying a Context-based Method to Build a Knowledge Graph for the Blue Amazon

Ligabue, P. D.; Brandao, A. A. F.; Peres, S. M.; Cozman, F. G.; Pirozelli, P.

Data Intell. 2024;6(1):64-103

2024

DOI: 10.1162/dint_a_00223 · Ref ID: 3047

Knowledge graphs are employed in several tasks, such as question answering and recommendation systems, due to their ability to represent relationships between concepts. Automatically constructing such graphs, however, remains an unresolved challenge within knowledge representation. To tackle this challenge, we propose CtxKG, a method specifically aimed at extracting knowledge graphs in a context of limited resources in which the only input is a set of unstructured text documents. CtxKG is based on OpenIE (a relationship triple extraction method) and BERT (a language model) and contains four stages: the extraction of relationship triples directly from text; the identification of synonyms across triples; the merging of similar entities; and the building of bridges between knowledge graphs of different documents. Our method distinguishes itself from those in the current literature (i) through its use of the parse tree to avoid the overlapping entities produced by base implementations of OpenIE; and (ii) through its bridges, which create a connected network of graphs, overcoming a limitation similar methods have of one isolated graph per document. We compare our method to two others by generating graphs for movie articles from Wikipedia and contrasting them with benchmark graphs built from the OMDb movie database. Our results suggest that our method is able to improve multiple aspects of knowledge graph construction. They also highlight the critical role that triple identification and named-entity recognition have in improving the quality of automatically generated graphs, suggesting future paths for investigation. Finally, we apply CtxKG to build BlabKG, a knowledge graph for the Blue Amazon, and discuss possible improvements.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1663 - Lim 2024
Multilingual Question Answering for Malaysia History with Transformer-based Language Model

Lim, Q. Z.; Lee, C. P.; Lim, K. M.; Ng, J. X.; Ooi, E. K. H.; Loh, N. K. N.

Emerg. Sci. J. 2024;8(2):675-686

2024

DOI: 10.28991/ESJ-2024-08-02-019 · Ref ID: 4054

In natural language processing (NLP), a Question Answering System (QAS) refers to a system or model that is designed to understand and respond to user queries in natural language. As we navigate through the recent advancements in QAS, it can be observed that there is a paradigm shift of the methods used from traditional machine learning and deep learning approaches towards transformer-based language models. While significant progress has been made, the utilization of these models for historical QAS and the development of QAS for Malay language remain largely unexplored. This research aims to bridge the gaps, focusing on developing a Multilingual QAS for history of Malaysia by utilizing a transformer-based language model. The system development process encompasses various stages, including data collection, knowledge representation, data loading and pre-processing, document indexing and storing, and the establishment of a querying pipeline with the retriever and reader. A dataset with a collection of 100 articles, including web blogs related to the history of Malaysia, has been constructed, serving as the knowledge base for the proposed QAS. A significant aspect of this research is the use of the translated dataset in English instead of the raw dataset in Malay. This decision was made to leverage the effectiveness of well-established retriever and reader models that were trained on English data. Moreover, an evaluation dataset comprising 100 question-answer pairs has been created to evaluate the performance of the models. A comparative analysis of six different transformer-based language models, namely DeBERTaV3, BERT, ALBERT, ELECTRA, MiniLM, and RoBERTa, has been conducted, where the effectiveness of the models was examined through a series of experiments to determine the best reader model for the proposed QAS. The experimental results reveal that the proposed QAS achieved the best performance when employing RoBERTa as the reader model. 
Finally, the proposed QAS was deployed on Discord and equipped with multilingual support through the incorporation of language detection and translation modules, enabling it to handle queries in both Malay and English. © 2024, Ital Publication. All rights reserved.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#430 - Lin 2023
Knowledge Graph Completion for Power Grid Main Equipment Using Pretrained Language Models

Lin, C. X.; Zheng, Z.; Cai, S. T.; Fu, L.; Xie, W.; Ma, T.; Zhang, Z. H.

19th International Conference on Advanced Intelligent Computing Technology and Applications (ICIC) 2023;14089():828-838

Zhengzhou, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2023

DOI: 10.1007/978-981-99-4752-2_68 · Ref ID: 2938

The safe and stable operation of power systems relies on the timely diagnosis of defects in power grid equipment. To achieve this, a knowledge graph (KG) can be used to model power grid equipment defect knowledge, and knowledge graph embedding (KGE) can be utilized to embed the KG into low-dimensional vector spaces for deep learning models. However, pre-trained language model-based KGE methods may not perform as well as structure-based methods due to their limitations in explicitly representing domain-specific knowledge and supplementary information about entities. In this study, a hybrid KGE model called PLMSM was proposed to address this issue. PLMSM combines pre-trained language models with structure-based models to input entities and their supplementary information into a pre-trained language model to obtain their embeddings, which are then combined with the embeddings generated by a structure-based model for entity completion tasks. The model was optimized through efficient negative sampling and addressed the issue of inaccurate predictions caused by long-tail entities in the power grid defects KG. The experimental results showed that PLMSM achieved good performance in entity completion tasks on the power grid equipment defects KG. This proposed model has potential applications in power grid equipment defect diagnosis and maintenance.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2063 - Lin 2024
Entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration

Lin, J.; Li, S.; Qin, N.; Ding, S.

Math Biosci Eng 2024;21(1):1228-1248

2024

DOI: 10.3934/mbe.2024052 · Ref ID: 6004

The operation and maintenance of railway signal systems generate a large and complex body of fault-related text data. Aiming at the problems of fuzzy entity boundaries and low accuracy of entity recognition in the field of railway signal equipment faults, this paper provides a method for entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration. First, the model utilizes the RoBERTa-wwm pretrained language model to obtain word vectors for text sequences. Second, a parallel network consisting of a BiLSTM and a CNN is constructed to obtain contextual feature information and local attention information, respectively. Third, the feature vectors output by the BiLSTM and CNN are combined and fed into a multi-head attention (MHA) layer, focusing on extracting key feature information and mining the connections between different features. Finally, label sequences with constraint relationships are output by a CRF layer to complete the entity recognition task. The experimental analysis is carried out on fault text of railway signal equipment from the past ten years, and the results show that the model achieves higher evaluation indices than traditional models on this dataset, with precision, recall, and F1 values of 93.25%, 92.45%, and 92.85%, respectively.

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#3955 - Lin 2024
Unleashing the Power of LLMs as Multi-Modal Encoders for Text and Graph-Structured Data

Lin, Jiacheng; Qian, Kun; Han, Haoyu; Choudhary, Nurendra; Wei, Tianxin; Wang, Zhongruo; Genc, Sahika; Huang, Edward W.; Wang, Sheng; Subbian, Karthik; Koutra, Danai; Sun, Jimeng

arXiv 2024;():

2024

Ref ID: 8706

Graph-structured information offers rich contextual information that can enhance language models by providing structured relationships and hierarchies, leading to more expressive embeddings for various applications such as retrieval, question answering, and classification. However, existing methods for integrating graph and text embeddings, often based on Multi-layer Perceptrons (MLPs) or shallow transformers, are limited in their ability to fully exploit the heterogeneous nature of these modalities. To overcome this, we propose Janus, a simple yet effective framework that leverages Large Language Models (LLMs) to jointly encode text and graph data. Specifically, Janus employs an MLP adapter to project graph embeddings into the same space as text embeddings, allowing the LLM to process both modalities jointly. Unlike prior work, we also introduce contrastive learning to align the graph and text spaces more effectively, thereby improving the quality of learned joint embeddings. Empirical results across six datasets spanning three tasks, knowledge graph-contextualized question answering, graph-text pair classification, and retrieval, demonstrate that Janus consistently outperforms existing baselines, achieving significant improvements across multiple datasets, with gains of up to 11.4% in QA tasks. These results highlight Janus's effectiveness in integrating graph and text data. Ablation studies further validate the effectiveness of our method.
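The adapter-plus-contrastive-learning recipe described above can be sketched generically: project graph embeddings through an MLP adapter into the text-embedding space, then pull matched graph-text pairs together with an InfoNCE-style loss. A toy numpy illustration (all names, dimensions, and the single-direction loss are assumptions for illustration, not Janus's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_graph, d_text = 4, 6, 8

# One graph embedding and one text embedding per paired example (hypothetical).
graph_emb = rng.normal(size=(n, d_graph))
text_emb = rng.normal(size=(n, d_text))

# Hypothetical MLP adapter: one hidden layer mapping graph embeddings
# into the text-embedding space.
W1 = rng.normal(size=(d_graph, 16))
W2 = rng.normal(size=(16, d_text))

def adapt(g):
    return np.maximum(g @ W1, 0.0) @ W2  # ReLU MLP projection

def info_nce(a, b, temperature=0.1):
    """InfoNCE over cosine similarities: row i of a should match row i of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))  # -log p(matched pair)

loss = info_nce(adapt(graph_emb), text_emb)
```

Minimizing this loss drives each projected graph embedding toward its paired text embedding and away from the other rows in the batch, which is the alignment effect the abstract attributes to contrastive learning.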

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#1896 - Lin 2023
Spatial Commonsense Reasoning for Machine Reading Comprehension

Lin, M.; Wang, M. X.; Yu, J.; Wang, S.; Lai, H.; Liu, W.; Yin, J.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14177 LNAI():347-361

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-46664-9_24 · Ref ID: 5163

This paper studies the problem of spatial commonsense reasoning for the machine reading comprehension task. Spatial commonsense is the human-shared but latent knowledge of object shape, size, distance, and position. Reasoning over this abstract knowledge can help machines better perceive their surroundings, which is crucial for general intelligence. However, this valuable topic is challenging and has been less studied. To bridge this research gap, we focus on this topic and propose a new method to realize spatial reasoning. Given a text, we first build a potential reasoning graph based on its parsing tree. To better support spatial reasoning, we retrieve the related commonsense entities and relations from external knowledge sources, including the pre-trained language model (LM) and knowledge graph (KG). The LM covers all kinds of factual knowledge, and the KG has abundant commonsense relations. We then propose a new fusion method called LEGRN (LM Edge-GNN Reasoner Networks) to fuse the text and graph. LEGRN adopts layer-based attention to integrate the LM text encoder and KG graph encoder, which can capture correlations between LM text context and KG graph structure. Considering that spatial relations involve a variety of attributes, we propose an attribute-aware inferential network to deduce the correct answers. To evaluate our approach, we construct a new large-scale dataset named CRCSpatial, consisting of 40k spatial reasoning questions. Experimental results illustrate the effectiveness of our approach. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#296 - Lin 2023
Fusing topology contexts and logical rules in language models for knowledge graph completion

Lin, Q. K.; Mao, R.; Liu, J.; Xu, F. Z.; Cambria, E.

Inf. Fusion 2023;90():253-264

2023

DOI: 10.1016/j.inffus.2022.09.020 · Ref ID: 3062

Knowledge graph completion (KGC) aims to infer missing facts based on the observed ones, which is significant for many downstream applications. Given the success of deep learning and pre-trained language models (LMs), some LM-based methods have been proposed for the KGC task. However, most of them focus on modeling the text of fact triples and ignore the deeper semantic information (e.g., topology contexts and logical rules) that is significant for KG modeling. For this reason, we propose a unified framework FTL-LM to Fuse Topology contexts and Logical rules in Language Models for KGC, which mainly contains a novel path-based method for topology-context learning and a variational expectation-maximization (EM) algorithm for soft logical-rule distilling. The former utilizes a heterogeneous random walk to generate topology paths and, further, reasoning paths that represent topology contexts implicitly and can be modeled explicitly by an LM. The strategies of mask language modeling and contrastive path learning are introduced to model these topology contexts. The latter implicitly fuses logical rules via a variational EM algorithm with two LMs. Specifically, in the E-step, the triple LM is updated under the supervision of observed triples and valid hidden triples verified by the fixed rule LM. In the M-step, we fix the triple LM and fine-tune the rule LM to update the logical rules. Experiments on three common KGC datasets demonstrate the superiority of the proposed FTL-LM; e.g., it achieves 2.1% and 3.1% Hits@10 improvement over the state-of-the-art LM-based model LP-BERT on WN18RR and FB15k-237, respectively.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1538 - Lin 2024
A Knowledge-Injected Curriculum Pretraining Framework for Question Answering

Lin, X.; Su, T.; Huang, Z.; Xue, S.; Liu, H.; Chen, E.

WWW 2024 - Proceedings of the ACM Web Conference 2024;():1986-1997

Association for Computing Machinery, Inc 2024

DOI: 10.1145/3589334.3645406 · Ref ID: 4089

Knowledge-based question answering (KBQA) is a key task in natural language processing research, and also an approach to access web data and knowledge, which requires exploiting knowledge graphs (KGs) for reasoning. In the literature, one promising solution for KBQA is to incorporate the pretrained language model (LM) with KGs by generating a KG-centered pretraining corpus, which has shown its superiority. However, these methods often depend on specific techniques and resources to work, which may not always be available and restrict their application. Moreover, existing methods focus more on improving language understanding with KGs, while neglecting the more important human-like complex reasoning. To this end, in this paper, we propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for KBQA tasks, which is composed of knowledge injection (KI), knowledge adaptation (KA) and curriculum reasoning (CR). Specifically, the KI module first injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps that can work with different implementations for flexible application. Next, the KA module learns knowledge from the generated corpus with the LM equipped with an adapter, while keeping its original natural language understanding ability to reduce the negative impact of the difference between the generated and natural corpus. Last, to enable the LM to perform complex reasoning, the CR module follows human reasoning patterns to construct three corpora with increasing difficulty of reasoning, and further trains the LM from easy to hard in a curriculum manner to promote model learning. We provide an implementation of the general framework, and evaluate the proposed KICP on four real-world datasets. The results demonstrate that our framework achieves higher performance and generalizes well to other QA tasks. © 2024 ACM.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#565 - Ling 2022
MetaGNN-Based Medical Records Unstructured Specialized Vocabulary Few-Shot Representation Learning

Ling, H. X.; Luo, G. S.; Yang, Y.

IEEE Access 2022;10():118665-118675

2022

DOI: 10.1109/access.2022.3219988 · Ref ID: 3768

With the continuous breakthroughs in artificial intelligence technology, it has become easier to extract general-purpose knowledge using machine learning, but it is a challenging task to extract and learn small samples of knowledge in medical expertise. On the one hand, it is difficult to represent medical expertise entities; on the other hand, the training samples of such expertise are small, and deep learning methods often require a large number of samples to complete the learning task. To this end, we propose a graph network learning method for specialized vocabulary representation. Specifically, a contextual knowledge representation model based on graph meta-learning is proposed, which combines text, phrase, vocabulary, and other information to solve the problem that the sparse entities of medical electronic medical records cannot otherwise be extracted and learned. In this method, a text-independent lexical representation learning method, a context-aware graph neural network, and a combined LSTM language model are used to model information from different perspectives as a way to learn semantic representations of specialized vocabulary entities. The experimental results show that the accuracy of the method outperforms other similar methods and proves its effectiveness.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2985 - Lipaczewski 2013
Teaching and Training Formal Methods for Safety Critical Systems

Lipaczewski, M.; Ortmeier, F.

2013 39th Euromicro Conference on Software Engineering and Advanced Applications 2013;():408-413

2013

DOI: 10.1109/SEAA.2013.54 · Ref ID: 6574

Embedded systems have become a major part of many domains. This includes systems that can cause severe damage and injury when they fail. However, because of the rising number of software components used within this embedded hardware, safety-related problems are hard to discover, and it is even harder to prove that there are none. One approach to guaranteeing the correctness of a system is model-based safety analysis, which relies on an abstract representation of the system that can then be analyzed using model checkers. The results of these analyses are in general much more precise and often reveal surprising failure combinations that no one had thought of before. Nevertheless, model-based safety analysis is not widely used, mainly because it is not well known and is hard to apply to current safety standards, which rely on manual approaches. Another factor might be that most approaches are scientific and in most cases prototypes that are hard to use. In this paper we present some ideas and first steps towards an easy-to-learn and easy-to-use model-based safety approach. Additionally, we present different user interfaces that are intended to support the user's learning.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1252 - Lippolis 2023
Enhancing Entity Alignment Between Wikidata and ArtGraph Using LLMs

Lippolis, A. S.; Klironomos, A.; Milon-Flores, D. F.; Zheng, H.; Jouglar, A.; Norouzi, E.; Hogan, A.

CEUR Workshop Proceedings 2023;3540():

CEUR-WS 2023

Ref ID: 5051

Knowledge graphs (KGs) are used in a wide variety of applications, including within the cultural heritage domain. An important prerequisite of such applications is the quality and completeness of the data. Using a single KG might not be enough to fulfill this requirement. The absence of connections between KGs complicates taking advantage of the complementary data they can provide. This paper focuses on the Wikidata and ArtGraph KGs, which exhibit gaps in content that can be filled by enriching one with data from the other. Entity alignment can help to combine data from KGs by connecting entities that refer to the same real-world entities. However, entity alignment in art-domain knowledge graphs remains under-explored. In the pursuit of entity alignment between ArtGraph and Wikidata, a hybrid approach is proposed. The first part, which we call WES (Wikidata Entity Search), utilizes traditional Wikidata SPARQL queries and is followed by a supplementary sequence-to-sequence large language model (LLM) pipeline that we denote as pArtLink. The combined approach successfully aligned artworks and artists, with WES identifying entities for 14,982 artworks and 2,029 artists, and pArtLink further aligning 76 additional artists, thus enhancing the alignment process beyond WES's capabilities. © 2023 Copyright for this paper by its authors.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#322 - Lissandrini 2020
Graph-Query Suggestions for Knowledge Graph Exploration

Lissandrini, M.; Mottin, D.; Palpanas, T.; Velegrakis, Y.; Assoc Comp, Machinery

29th Web Conference (WWW) 2020;():2549-2555

Taipei, TAIWAN Assoc Computing Machinery 2020

DOI: 10.1145/3366423.3380005 · Ref ID: 2988

We consider the task of exploratory search through graph queries on knowledge graphs. We propose to assist the user by expanding the query with intuitive suggestions to provide a more informative (full) query that can retrieve more detailed and relevant answers. To achieve this, we propose a model that bridges graph search paradigms with well-established techniques for information retrieval. Our approach does not require any additional knowledge from the user and builds on principled language modelling approaches. We empirically show the effectiveness and efficiency of our approach on a large knowledge graph, and how our suggestions are able to help build more complete and informative queries.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#71 - Liu 2024
Bootstrapping Large Language Models for Radiology Report Generation

Liu, C.; Tian, Y. H.; Chen, W. D.; Song, Y.; Zhang, Y. D.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18635-18643

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3502

Radiology report generation (RRG) aims to automatically generate a free-text description from a specific clinical radiograph, e.g., chest X-ray images. Existing approaches tend to perform RRG with specific models trained from scratch on public yet limited data, which often leads to inferior performance owing to inefficient capabilities in both aligning visual and textual features and generating informative reports accordingly. Recently, large language models (LLMs) have offered a promising solution to text generation with their power in learning from big data, especially for cross-modal scenarios such as RRG. However, most existing LLMs are pre-trained on general data and, if applied to RRG, suffer from the same problem as conventional approaches, caused by the knowledge gap between the general and medical domains. Therefore, in this paper, we propose an approach to bootstrapping LLMs for RRG with an in-domain instance induction and a coarse-to-fine decoding process. Specifically, the in-domain instance induction process learns to align the LLM from general texts to radiology reports through contrastive learning. The coarse-to-fine decoding performs a text elevating process for the reports from the ranker, further enhanced with visual features and refinement prompts. Experimental results on two prevailing RRG datasets, namely IU X-Ray and MIMIC-CXR, demonstrate the superiority of our approach over previous state-of-the-art solutions. Further analyses illustrate that, for the LLM, the induction process enables it to better align with the medical domain, and the coarse-to-fine generation allows it to conduct more precise text generation.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#546 - Liu 2024
MAKG: A maritime accident knowledge graph for intelligent accident analysis and management

Liu, D. G.; Cheng, L.

Ocean Eng. 2024;312():14

2024

DOI: 10.1016/j.oceaneng.2024.119280 · Ref ID: 3096

With the increasing frequency of human activities at sea, maritime accidents are occurring more often. Analyzing and mining maritime accident cases can help uncover the causal mechanisms behind these incidents, thereby enhancing maritime safety. As an emerging technology for knowledge management and mining, knowledge graphs offer significant support for the storage, reasoning, and decision-making processes related to maritime accidents. In this study, we established a knowledge graph construction and application framework for maritime accidents to facilitate the extraction and management of maritime knowledge from unstructured texts. First, 581 accident reports released by the China Maritime Safety Administration over the past decade (2014-2023) were used as the data basis for analysis and construction of the maritime accident ontology using the seven-step method; the resulting ontology comprises 8 entity types, 8 relationship types, and 18 attribute entity types. Second, we proposed MBERT-BiLSTM-CRF-SF, a named entity recognition model based on domain pretraining and self-training, to reduce graph construction costs. This model achieved state-of-the-art performance in the maritime domain, with an F1 score of 0.910 +/- 0.006, about 5% higher than mainstream models. In addition, we proposed an entity alignment method based on font and semantics to further refine the knowledge. On the basis of the proposed methods, we constructed a large, high-quality maritime accident knowledge graph (MAKG) system that contains 16,099 entities and 20,809 relationship instances. Finally, we reduced the complexity of applying knowledge graphs by integrating the CRISPE prompt learning framework of large language models, and experiments on graph traversal, pattern recognition, and aggregation analysis were conducted to assess the quality of MAKG. Results demonstrate that MAKG can effectively enhance the efficiency of querying and reasoning about maritime accident information, thus providing significant support for the prevention and management of maritime accidents.
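For reference, the reported F1 of 0.910 for the NER model is the standard harmonic mean of entity-level precision and recall. A minimal sketch, with hypothetical maritime entity spans:

```python
def entity_f1(gold, pred):
    """Micro F1 over sets of (start, end, type) entity spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                              # exact-match true positives
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative spans, not from the paper's data:
gold = {(0, 2, "VESSEL"), (5, 7, "LOCATION"), (9, 10, "CAUSE")}
pred = {(0, 2, "VESSEL"), (5, 7, "LOCATION"), (11, 12, "CAUSE")}
# 2 true positives, 1 false positive, 1 false negative -> P = R = F1 = 2/3
```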

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3409 - Liu 2023
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge

Liu, Genglin; Wang, Xingyao; Yuan, Lifan; Chen, Yangyi; Peng, Hao

arXiv 2023;():

2023

Ref ID: 7949

Can large language models (LLMs) express their uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses? This work aims to systematically investigate LLMs' behaviors in such situations, emphasizing the trade-off between honesty and helpfulness. To tackle the challenge of precisely determining LLMs' knowledge gaps, we diagnostically create unanswerable questions containing non-existent concepts or false premises, ensuring that they are outside the LLMs' vast training data. By compiling a benchmark, UnknownBench, which consists of both unanswerable and answerable questions, we quantitatively evaluate the LLMs' performance in maintaining honesty while being helpful. Using a model-agnostic unified confidence elicitation approach, we observe that most LLMs fail to consistently refuse or express uncertainty towards questions outside their parametric knowledge, although instruction fine-tuning and alignment techniques can provide marginal enhancements. Moreover, LLMs' uncertainty expression does not always stay consistent with the perceived confidence of their textual outputs.

brandon voted
yuexi voted
Final decision
What was the agreed final decision?

#329 - Liu 2022
Heterogeneous graph prompt for Community Question Answering

Liu, H. H.; Qin, Y.

Concurr. Comput.-Pract. Exp. 2022;():10

2022

DOI: 10.1002/cpe.7156 · Ref ID: 3683

Community Question Answering (CQA), compared with general question answering, has been widely used in various scenarios such as e-commerce and is well received. In order to answer the user's question precisely, many CQA models resort to external knowledge sources such as Wikipedia. The main challenge of the task is knowledge extraction and utilization. Different from the traditional method of designing task-specific knowledge modules, we propose a graph prompt-based learning method that directly steers the pretrained language model to solve CQA tasks. Multiple information sources are organized as graph prompts to guide the generation of the model, naturally leveraging the knowledge learned in the pretraining step. Based on pretrained bidirectional and autoregressive transformers, a large-scale language model, comparable performance is achieved with less than 10% of the full-finetuning time by optimizing only the graph prompt parameters. Experiments on two standard CQA datasets show that, compared with traditional sequentially initialized prompts, the graph prompt achieves 20.47% and 14.89% increments in BLEU and ROUGE-L scores under quick finetuning and outperforms in few-shot learning.
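Since the gains above are reported in ROUGE-L, it may help to recall that ROUGE-L scores a candidate against a reference via their longest common subsequence (LCS) of tokens. A self-contained sketch (whitespace tokenization is a simplifying assumption):

```python
def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: LCS-based precision/recall over whitespace tokens."""
    ref, cand = reference.split(), candidate.split()
    m, n = len(ref), len(cand)
    # LCS length via standard dynamic programming
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if ref[i] == cand[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / n, lcs / m
    return 2 * precision * recall / (precision + recall)

# e.g. reference "the cat sat on the mat" vs. candidate "the cat lay on the mat":
# the LCS is "the cat on the mat" (5 of 6 tokens on each side).
```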

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1485 - Liu 2024
KNOWFORMER: Revisiting Transformers for Knowledge Graph Reasoning

Liu, J.; Mao, Q.; Jiang, W.; Li, J.

Proceedings of Machine Learning Research 2024;235():31669-31690

ML Research Press 2024

Ref ID: 4368

Knowledge graph reasoning plays a vital role in various applications and has garnered considerable attention. Recently, path-based methods have achieved impressive performance. However, they may face limitations stemming from constraints in message-passing neural networks, such as missing paths and information over-squashing. In this paper, we revisit the application of transformers for knowledge graph reasoning to address the constraints faced by path-based methods and propose a novel method KNOWFORMER. KNOWFORMER utilizes a transformer architecture to perform reasoning on knowledge graphs from the message-passing perspective, rather than reasoning by textual information like previous pretrained language model based methods. Specifically, we define the attention computation based on the query prototype of knowledge graph reasoning, facilitating convenient construction and efficient optimization. To incorporate structural information into the self-attention mechanism, we introduce structure-aware modules to calculate query, key, and value respectively. Additionally, we present an efficient attention computation method for better scalability. Experimental results demonstrate the superior performance of KNOWFORMER compared to prominent baseline methods on both transductive and inductive benchmarks. Copyright 2024 by the author(s)

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#876 - Liu 2022
VoCSK: Verb-oriented commonsense knowledge mining with taxonomy-guided induction

Liu, J. P.; Chen, T.; Wang, C.; Liang, J. Q.; Chen, L. H.; Xiao, Y. H.; Chen, Y. W.; Jin, K.

Artif. Intell. 2022;310():23

2022

DOI: 10.1016/j.artint.2022.103744 · Ref ID: 3756

Commonsense knowledge acquisition is one of the fundamental issues in realizing human-level AI. However, commonsense knowledge is difficult to obtain because it is a human consensus and rarely appears explicitly in texts or other data. In this paper, we focus on the automatic acquisition of a typical kind of implicit verb-oriented commonsense knowledge (e.g., "person eats food"), which is the concept-level knowledge of verb phrases. For this purpose, we propose a taxonomy-guided induction method to mine verb-oriented commonsense knowledge from verb phrases with the help of a probabilistic taxonomy. First, we design an entropy-based triplet filter to cope with noisy verb phrases. Then, we propose a joint model based on the minimum description length principle and a neural language model to generate verb-oriented commonsense knowledge. Besides, we introduce two strategies to accelerate the computation, including a simulated annealing-based approximate solution and a verb phrase clustering method. Finally, we conduct extensive experiments to prove that our solution is more effective than competitors in mining verb-oriented commonsense knowledge. We construct a commonsense knowledge base called VoCSK, containing 259 verbs and 18,406 pieces of verb-oriented commonsense knowledge. To verify the usefulness of VoCSK, we utilize the knowledge in this KB to improve model performance on two downstream applications. (C) 2022 Elsevier B.V. All rights reserved.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#408 - Liu 2023
KEPT: Knowledge Enhanced Prompt Tuning for event causality identification

Liu, J. T.; Zhang, Z. Q.; Guo, Z.; Jin, L.; Li, X. Y.; Wei, K. W.; Sun, X.

Knowledge-Based Syst. 2023;259():12

2023

DOI: 10.1016/j.knosys.2022.110064 · Ref ID: 3727

Event causality identification (ECI) aims to identify causal relations of event mention pairs in text. Despite certain accomplishments, existing methods are still not effective due to the following two issues: (1) the lack of causal reasoning ability, which imposes restrictions on recognizing implicit causal relations; (2) the significant gap between fine-tuning and pre-training, which hinders the utilization of pre-trained language models (PLMs). In this paper, we propose a novel Knowledge Enhanced Prompt Tuning (KEPT) framework for ECI to address the issues mentioned above. Specifically, this method leverages prompt tuning to incorporate two kinds of knowledge obtained from external knowledge bases (KBs), namely background information and relational information, for causal reasoning. To introduce external knowledge into our model, we first convert it to textual descriptions, then design an interactive attention mechanism and a selective attention mechanism to fuse background information and relational information, respectively. In addition, to further capture implicit relations between events, we adopt the objective from knowledge representation learning to jointly optimize the representations of causal relations and events. Experimental results on two widely used benchmarks demonstrate that the proposed method outperforms state-of-the-art models. (C) 2022 Elsevier B.V. All rights reserved.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1123 - Liu 2023
Constructing Knowledge Graph from Cyber Threat Intelligence Using Large Language Model

Liu, J.; Zhan, J.

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():516-521

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BigData59044.2023.10386611 · Ref ID: 4987

Cyber Threat Intelligence (CTI) reports are valuable resources in various applications, but manually extracting information from them is time-consuming. Existing approaches for automating extraction require specialized models trained on a substantial corpus. In this paper, we present an efficient methodology for constructing knowledge graphs from CTI by leveraging a Large Language Model (LLM), using ChatGPT as an example. Our approach automatically extracts attack-related entities and their relationships, organizing them within a CTI knowledge graph. We evaluate our approach on 13 CTIs, demonstrating better performance compared to AttacKG and REBEL while requiring less manual intervention and fewer computational resources. This proves the feasibility and suitability of our method in low-resource scenarios, specifically within the domain of cyber threat intelligence. © 2023 IEEE.
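The pipeline described above prompts an LLM and organizes its output into graph edges; a hypothetical sketch of that post-processing step follows. The prompt wording and the "head | relation | tail" line format are illustrative assumptions, not the paper's exact prompt.

```python
# Hypothetical prompt template asking the LLM for pipe-delimited triples.
PROMPT = (
    "Extract attack-related entities and relations from the CTI report below.\n"
    "Output one triple per line as: head | relation | tail\n\nReport:\n{report}"
)

def parse_triples(llm_output: str):
    """Parse 'head | relation | tail' lines into (head, relation, tail) tuples,
    silently skipping malformed lines."""
    triples = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples

# Illustrative LLM output, including one malformed line that gets dropped:
sample = ("APT29 | uses | spear-phishing\n"
          "malformed line\n"
          "CozyDuke | targets | government agencies")
```

The parsed tuples would then be loaded into the CTI knowledge graph as edges.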

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3806 - Liu 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering

Liu, Jiacheng; Hallinan, Skyler; Lu, Ximing; He, Pengfei; Welleck, Sean; Hajishirzi, Hannaneh; Choi, Yejin

arXiv 2022;():

2022

Ref ID: 7584

Knowledge underpins reasoning. Recent research demonstrates that when relevant knowledge is provided as additional context to commonsense question answering (QA), it can substantially enhance performance even on top of the state of the art. The fundamental challenge is where and how to find such knowledge that is high quality and on point with respect to the question: knowledge retrieved from knowledge bases is incomplete, and knowledge generated from language models is inconsistent. We present Rainier, or Reinforced Knowledge Introspector, which learns to generate contextually relevant knowledge in response to given questions. Our approach starts by imitating knowledge generated by GPT-3, then learns to generate its own knowledge via reinforcement learning, where rewards are shaped based on the increased performance on the resulting question answering. Rainier demonstrates substantial and consistent performance gains when tested over 9 different commonsense benchmarks, including 5 datasets that are seen during model training and 4 datasets that are kept unseen. Our work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3351 - Liu 2024
DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs

Liu, Jinzhe; Huang, Xiangsheng; Chen, Zhuo; Fang, Yin

arXiv 2024;():

2024

Ref ID: 8427

Large Language Models (LLMs) encounter challenges with the unique syntax of specific domains, such as biomolecules. Existing fine-tuning or modality alignment techniques struggle to bridge the domain knowledge gap and understand complex molecular data, limiting LLMs' progress in specialized fields. To overcome these limitations, we propose an expandable and adaptable non-parametric knowledge injection framework named Domain-specific Retrieval-Augmented Knowledge (DRAK), aimed at enhancing reasoning capabilities in specific domains. Utilizing knowledge-aware prompts and gold label-induced reasoning, DRAK has developed profound expertise in the molecular domain and the capability to handle a broad spectrum of analysis tasks. We evaluated two distinct forms of DRAK variants, proving that DRAK exceeds previous benchmarks on six molecular tasks within the Mol-Instructions dataset. Extensive experiments have underscored DRAK's formidable performance and its potential to unlock molecular insights, offering a unified paradigm for LLMs to tackle knowledge-intensive tasks in specific domains. Our code will be available soon.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#667 - Liu 2024
PrimeNet: A Framework for Commonsense Knowledge Representation and Reasoning Based on Conceptual Primitives

Liu, Q.; Han, S. J.; Cambria, E.; Li, Y.; Kwok, K.

Cogn. Comput. 2024;():28

2024

DOI: 10.1007/s12559-024-10345-6 · Ref ID: 3769

Commonsense knowledge acquisition and representation is a core topic in artificial intelligence (AI), crucial for building more sophisticated and human-like AI systems. However, existing commonsense knowledge bases organize facts in an isolated manner, like a bag of facts, lacking the cognitive-level connections that humans commonly possess. People have the ability to efficiently organize vast amounts of knowledge by linking or generalizing concepts using a limited set of conceptual primitives that serve as the fundamental building blocks of reasoning. These conceptual primitives are basic, foundational elements of thought that humans use to make sense of the world. By combining and recombining these primitives, people can construct complex ideas, solve problems, and understand new concepts. To emulate this cognitive mechanism, we design a new commonsense knowledge base, termed PrimeNet, organized in a three-layer structure: a small core of conceptual primitives (e.g., FOOD), a bigger set of concepts that connect to such primitives (e.g., fruit), and an even larger layer of entities connecting to the concepts (e.g., banana). First, we collect commonsense knowledge and employ a gradual expansion strategy for knowledge integration. After refinement, PrimeNet contains 6 million edges between 2 million nodes, with 34 different types of relations. Then, we design a new conceptualization method by leveraging a probabilistic taxonomy to build the concept layer of PrimeNet. Finally, we conduct primitive detection to build the primitive layer, where a lexical substitution task is used to identify related concepts, and large language models are employed to generate a rational primitive to label each concept cluster as well as to verify the primitive detection process.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#709 - Liu 2022
Relational Memory-Augmented Language Models

Liu, Q.; Yogatama, D.; Blunsom, P.

Trans. Assoc. Comput. Linguist. 2022;10():555-572

2022

DOI: 10.1162/tacl_a_00476 · Ref ID: 2945

We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation triples and retrieve relevant relations for a given context to improve text generation. Experiments on WikiText-103, WMT19, and enwik8 English datasets demonstrate that our approach produces a better language model in terms of perplexity and bits per character. We also show that relational memory improves coherence, is complementary to token-based memory, and enables causal interventions. Our model provides a simple yet effective way to combine an autoregressive language model and a knowledge graph for more coherent and logical generation.
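A minimal sketch (not the authors' implementation) of the retrieve-then-condition idea described above: select triples whose head entity is mentioned in the context, then serialize them as an extra prefix for the language model. The triples and matching rule are illustrative assumptions.

```python
# Toy relational memory: a list of (head, relation, tail) triples.
TRIPLES = [
    ("Alan Turing", "field", "computer science"),
    ("Alan Turing", "born_in", "London"),
    ("Ada Lovelace", "field", "mathematics"),
]

def retrieve(context: str, triples=TRIPLES):
    """Return triples whose head entity is mentioned in the context."""
    return [t for t in triples if t[0].lower() in context.lower()]

def serialize(triples):
    """Flatten triples into a text prefix the language model can condition on."""
    return " ; ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in triples)

memory = serialize(retrieve("A biography of Alan Turing"))
```

In the paper's setting the retrieved relations augment an autoregressive LM; here they simply become a string prefix.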

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1968 - Liu 2024
Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach

Liu, S.; Wu, H.; Deng, G.; Chen, J.; Wang, X.; Song, L.

IEEE J. Sel. Top. Sign. Proces. 2024;():1-13

2024

DOI: 10.1109/JSTSP.2024.3414147 · Ref ID: 4519

Knowledge-enhanced text generation aims to enhance the quality of generated text by utilizing internal or external knowledge sources. While language models have demonstrated impressive capabilities in generating coherent and fluent text, their lack of interpretability presents a substantial obstacle. The limited interpretability of generated text significantly impacts its practical usability, particularly in knowledge-enhanced text generation tasks that necessitate reliability and explainability. Existing methods often employ domain-specific knowledge retrievers that are tailored to specific data characteristics, limiting their generalizability to diverse data types and tasks. To overcome this limitation, we directly leverage the two-tier architecture of structured knowledge, consisting of high-level entities and low-level knowledge triples, to design our task-agnostic structured knowledge hunter. Specifically, we employ a local-global interaction scheme for structured knowledge representation learning and a hierarchical transformer-based pointer network as the backbone for selecting relevant knowledge triples and entities. By combining the strong generative ability of language models with the high faithfulness of the knowledge hunter, our model achieves high interpretability, enabling users to comprehend the model's output generation process. Furthermore, we empirically demonstrate the effectiveness of our model in both internal knowledge-enhanced table-to-text generation on the RotoWire-FG dataset and external knowledge-enhanced dialogue response generation on the KdConv dataset. Our task-agnostic model outperforms state-of-the-art methods and corresponding language models, setting new standards on the benchmark. IEEE

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#890 - Liu 2023
Zero-Shot Text Classification with Semantically Extended Textual Entailment

Liu, T. F.; Hu, Y. L.; Chen, P.; Sun, Y. F.; Yin, B. C.; Ieee

International Joint Conference on Neural Networks (IJCNN) 2023;():

Broadbeach, AUSTRALIA Ieee 2023

DOI: 10.1109/ijcnn54540.2023.10191094 · Ref ID: 3373

Zero-shot text classification (0SHOT-TC) aims to detect classes that the model has never seen in the training set, and has attracted much attention in the Natural Language Processing (NLP) research community. The emergence of pretrained language models has fostered the progress of 0SHOT-TC, which turns the task into a textual entailment problem of binary classification: the model learns an entailment relatedness (yes/no) between the given sentence (premise) and each category (hypothesis) separately. However, hypothesis generation paradigms need to be further studied, since the label itself or the label descriptions have limited ability to fully express the category space. Conversely, humans can easily extend a set of words describing the categories to be classified. In this paper, we propose a novel zero-shot text classification method called Semantically Extended Textual Entailment (SETE), which imitates this human ability of knowledge extension. In the proposed method, three semantic extension methods are used to enrich the categories through a combination of static knowledge (e.g., expert knowledge, knowledge graphs) and dynamic knowledge (e.g., language models), and a textual entailment model is finally used for 0SHOT-TC. The experimental results on benchmarks show that our approach significantly outperforms current methods in both generalized and non-generalized 0SHOT-TC.
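The entailment reformulation that SETE builds on can be sketched as follows; the word-overlap scorer is a toy stand-in for a real NLI model, and the extended label terms are illustrative assumptions rather than SETE's actual extensions.

```python
import re

def zero_shot_classify(premise, label_terms, entail_score):
    """Pick the label whose (semantically extended) hypothesis scores highest.
    label_terms maps each label to its list of extension terms."""
    best, best_score = None, float("-inf")
    for label, terms in label_terms.items():
        # each label's hypothesis is enriched with its extended terms
        hypothesis = f"This text is about {label} ({', '.join(terms)})."
        score = entail_score(premise, hypothesis)
        if score > best_score:
            best, best_score = label, score
    return best

def toy_overlap_score(premise, hypothesis):
    # crude stand-in for an NLI entailment probability
    return len(set(premise.lower().split()) & set(re.findall(r"[a-z]+", hypothesis.lower())))

labels = {"sports": ["football", "match"], "finance": ["stock", "market"]}
predicted = zero_shot_classify("the football match ended", labels, toy_overlap_score)
```

In practice `entail_score` would be a pretrained entailment model's yes-probability.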

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3403 - Liu 2024
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs

Liu, Xiaoze; Wu, Feijie; Xu, Tianyang; Chen, Zhuo; Zhang, Yichi; Wang, Xiaoqian; Gao, Jing

arXiv 2024;():

2024

Ref ID: 8209

The advent of Large Language Models (LLMs) has significantly transformed the AI landscape, enhancing machine learning and AI capabilities. Factuality issue is a critical concern for LLMs, as they may generate factually incorrect responses. In this paper, we propose GraphEval to evaluate an LLM's performance using a substantially large test dataset. Specifically, the test dataset is retrieved from a large knowledge graph with more than 10 million facts without expensive human efforts. Unlike conventional methods that evaluate LLMs based on generated responses, GraphEval streamlines the evaluation process by creating a judge model to estimate the correctness of the answers given by the LLM. Our experiments demonstrate that the judge model's factuality assessment aligns closely with the correctness of the LLM's generated outputs, while also substantially reducing evaluation costs. Besides, our findings offer valuable insights into LLM performance across different metrics and highlight the potential for future improvements in ensuring the factual integrity of LLM outputs. The code is publicly available at https://github.com/xz-liu/GraphEval.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#418 - Liu 2023
Knowledge Base Question Answering via Semantic Analysis

Liu, Y. B.; Zhang, H. S.; Zong, T.; Wu, J. P.; Dai, W.

Electronics 2023;12(20):14

2023

DOI: 10.3390/electronics12204224 · Ref ID: 3715

Knowledge Question Answering is one of the important research directions in the field of robot intelligence. It analyzes users' questions and generates answers based on background knowledge, and is one of the important applications of knowledge graph technology. Compared with traditional expert question-answering systems, it has the advantages of a large-scale background knowledge base and of a traceable, interpretable question-answering process. Compared with current ChatGPT (Chat Generative Pre-trained Transformer) technology, it has advantages in specialized vertical domains. Aiming at the low accuracy of existing knowledge question-answering methods, this paper studies semantic analysis for knowledge question answering supported by a knowledge base, proposes a knowledge question-answering method based on the superposition of multiple neural network models, and conducts experimental verification on the publicly available NLPCC2016KBQA (Knowledge Q&A task of the 2016 Natural Language Processing and Chinese Computing Conference) data set. The experimental results show that the F1 value of this method is higher than that of the baseline model.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#542 - Liu 2023
Local and Global: Temporal Question Answering via Information Fusion

Liu, Y. H.; Liang, D.; Li, M. Y.; Giunchiglia, F.; Li, X. M.; Wang, S. R.; Wu, W.; Huang, L.; Feng, X. Y.; Guan, R. C.

32nd International Joint Conference on Artificial Intelligence (IJCAI) 2023;():5141-5149

Macao, PEOPLES R CHINA Ijcai-Int Joint Conf Artif Intell 2023

Ref ID: 3489

Many models that leverage knowledge graphs (KGs) have recently demonstrated remarkable success in question answering (QA) tasks. In the real world, many facts contained in KGs are time-constrained thus temporal KGQA has received increasing attention. Despite the fruitful efforts of previous models in temporal KGQA, they still have several limitations. (I) They neither emphasize the graph structural information between entities in KGs nor explicitly utilize a multi-hop relation path through graph neural networks to enhance answer prediction. (II) They adopt pre-trained language models (LMs) to obtain question representations, focusing merely on the global information related to the question while not highlighting the local information of the entities in KGs. To address these limitations, we introduce a novel model that simultaneously explores both Local information and Global information for the task of temporal KGQA (LGQA). Specifically, we first introduce an auxiliary task in the temporal KG embedding procedure to make timestamp embeddings time-order aware. Then, we design information fusion layers that effectively incorporate local and global information to deepen question understanding. We conduct extensive experiments on two benchmarks, and LGQA significantly outperforms previous state-of-the-art models, especially in difficult questions. Moreover, LGQA can generate interpretable and trustworthy predictions.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#123 - Liu 2020
Commonsense Evidence Generation and Injection in Reading Comprehension

Liu, Y.; Yang, T.; You, Z. Y.; Fan, W.; Yu, P. S.; Assoc Computat, Linguist

21st Annual Meeting of the Special-Interest-Group-on-Discourse-and-Dialogue (SIGDIAL) 2020;():61-73

Electr Network Assoc Computational Linguistics 2020

Ref ID: 3433

Humans tackle reading comprehension not only based on the given context itself but often rely on commonsense beyond it. To empower the machine with commonsense reasoning, in this paper we propose a Commonsense Evidence Generation and Injection framework for reading comprehension, named CEGI. The framework injects two kinds of auxiliary commonsense evidence into reading comprehension to equip the machine with the ability of rational thinking. Specifically, we build two evidence generators: one aims to generate textual evidence via a language model; the other aims to extract factual evidence (automatically aligned text triples) from a commonsense knowledge graph after graph completion. These two kinds of evidence incorporate contextual commonsense and serve as additional inputs to the reasoning model. Thereafter, we propose a deep contextual encoder to extract semantic relationships among the paragraph, question, option, and evidence. Finally, we employ a capsule network to extract different linguistic units (words and phrases) from the relations, and dynamically predict the optimal option based on the extracted units. Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms current state-of-the-art approaches and achieves the highest accuracy (83.6%) on the leaderboard.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3740 - Lo 2023
On Exploring the Reasoning Capability of Large Language Models with Knowledge Graphs

Lo, Pei-Chi; Tsai, Yi-Hang; Lim, Ee-Peng; Hwang, San-Yih

arXiv 2023;():

2023

Ref ID: 7964

This paper examines the capacity of LLMs to reason with knowledge graphs using their internal knowledge graph, i.e., the knowledge graph they learned during pre-training. Two research questions are formulated to investigate the accuracy of LLMs in recalling information from pre-training knowledge graphs and their ability to infer knowledge graph relations from context. To address these questions, we employ LLMs to perform four distinct knowledge graph reasoning tasks. Furthermore, we identify two types of hallucinations that may occur during knowledge reasoning with LLMs: content and ontology hallucination. Our experimental results demonstrate that LLMs can successfully tackle both simple and complex knowledge graph reasoning tasks from their own memory, as well as infer from input context.

yuexi voted
Davis voted
Final decision
What was the agreed final decision?

#494 - Lombardo 2024
Language Models Fine-Tuning for Automatic Format Reconstruction of SEC Financial Filings

Lombardo, G.; Trimigno, G.; Pellegrino, M.; Cagnoni, S.

IEEE Access 2024;12():31249-31261

2024

DOI: 10.1109/access.2024.3370444 · Ref ID: 3631

The analysis of financial reports is a crucial task for investors and regulators, especially the mandatory annual reports (10-K) required by the SEC (Securities and Exchange Commission), which provide crucial information about public companies in the American stock market. Although the SEC suggests a specific document format to standardize and simplify the analysis, in recent years several companies have introduced their own format and organization of the contents, making human-based and automatic knowledge extraction inherently more difficult. In this research work, we investigate different neural language models based on Transformer networks (bidirectional recurrence-based, autoregressive-based, and autoencoder-based approaches) to automatically reconstruct an SEC-like format of the documents as a multi-class classification task with 18 classes at the sentence level. In particular, we propose a bidirectional fine-tuning procedure to specialize pre-trained language models on this task. We propose and make the resulting novel transformer model, named SEC-former, publicly available to deal with this task. We evaluate SEC-former in three different scenarios: 1) in terms of topic detection performance; 2) in terms of document similarity (TF-IDF bag-of-words and Doc2Vec) achieved with respect to original and trustworthy financial reports, since this operation is leveraged for portfolio optimization tasks; and 3) testing the model in a real use-case scenario related to a public company that does not respect the SEC format but provides a human-supervised reference to reconstruct it.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#789 - Lonergan 2024
Stratified Evaluation of Large Language Model GPT-4's Question-Answering In Surgery reveals AI Knowledge Gaps

Lonergan, R. M.; Curry, J.; Dhas, K.; Simmons, B.

Br. J. Surg. 2024;111():1

2024

DOI: 10.1093/bjs/znae046.050 · Ref ID: 3793

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#788 - Lonergan 2023
Stratified Evaluation of GPT's Question Answering in Surgery Reveals Artificial Intelligence (AI) Knowledge Gaps

Lonergan, R. M.; Curry, J.; Dhas, K.; Simmons, B. I.

Cureus J Med Sci 2023;15(11):8

2023

DOI: 10.7759/cureus.48788 · Ref ID: 3642

Large language models (LLMs) have broad potential applications in medicine, such as aiding with education, providing reassurance to patients, and supporting clinical decision-making. However, there is a notable gap in understanding their applicability and performance in the surgical domain and how their performance varies across specialties. This paper aims to evaluate the performance of LLMs in answering surgical questions relevant to clinical practice and to assess how this performance varies across different surgical specialties. We used the MedMCQA dataset, a large-scale multi-choice question-answer (MCQA) dataset consisting of clinical questions across all areas of medicine. We extracted the relevant 23,035 surgical questions and submitted them to the popular LLMs Generative Pre-trained Transformers (GPT)-3.5 and GPT-4 (OpenAI OpCo, LLC, San Francisco, CA). Generative Pre-trained Transformer is a large language model that can generate human-like text by predicting subsequent words in a sentence based on the context of the words that come before it. It is pre-trained on a diverse range of texts and can perform a variety of tasks, such as answering questions, without needing task-specific training. The question-answering accuracy of GPT was calculated and compared between the two models and across surgical specialties. GPT-3.5 and GPT-4 achieved accuracies of 53.3% and 64.4%, respectively, on surgical questions, showing a statistically significant difference in performance. When compared to their performance on the full MedMCQA dataset, the two models performed differently: GPT-4 performed worse on surgical questions than on the dataset as a whole, while GPT-3.5 showed the opposite pattern. Significant variations in accuracy were also observed across different surgical specialties, with strong performances in anatomy, vascular, and paediatric surgery and worse performances in orthopaedics, ENT, and neurosurgery. Large language models exhibit promising capabilities in addressing surgical questions, although the variability in their performance between specialties cannot be ignored. The lower performance of the latest GPT-4 model on surgical questions relative to questions across all medicine highlights the need for targeted improvements and continuous updates to ensure relevance and accuracy in surgical applications. Further research and continuous monitoring of LLM performance in surgical domains are crucial to fully harnessing their potential and mitigating the risks of misinformation.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#1648 - Lorenzo 2024
Mitigating Data Scarcity in Semantic Parsing across Languages: the Multilingual Semantic Layer and its Dataset

Lorenzo, A. C. M.; Cabot, P. L. H.; Ghonim, K.; Xu, L.; Choi, H. S.; Castro, A. F.; Navigli, R.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14056-14080

Association for Computational Linguistics (ACL) 2024

Ref ID: 4220

Data scarcity is a prevalent challenge in the era of Large Language Models (LLMs). The insatiable hunger of LLMs for large corpora becomes even more pronounced when dealing with non-English and low-resource languages. The issue is particularly exacerbated in Semantic Parsing (SP), i.e., the task of converting text into a formal representation. The complexity of semantic formalisms makes training human annotators and subsequent data annotation unfeasible on a large scale, especially across languages. To mitigate this, we first introduce the Multilingual Semantic Layer (MSL), a conceptual evolution of previous formalisms, which decouples from disambiguation and external inventories and simplifies the task. MSL provides the necessary tools to encode meaning across languages, paving the way for developing a high-quality semantic parsing dataset across different languages via a semi-automatic strategy. Subsequently, we manually refine a portion of this dataset and fine-tune GPT-3.5 to propagate these refinements across the dataset. Then, we manually annotate 1,100 sentences in eleven languages, including low-resource ones. Finally, we assess our dataset's quality, showcasing the performance gap reduction across languages in Semantic Parsing. Our code and dataset are openly available at https://github.com/SapienzaNLP/MSL. © 2024 Association for Computational Linguistics.

Kwesi voted
yuexi voted
Final decision
What was the agreed final decision?

#1878 - Lotfy 2024
Sentiment Analysis for Arabic Product Reviews using LLMs and Knowledge Graphs

Lotfy, A.; Saleh, K.; Mohamed, S.; Lorance, J.; Yehia, E.; Mohammed, K.; AbdAlbaky, I.; Fathy, M.; Yasser, T.

6th International Conference on Computing and Informatics, ICCI 2024 2024;():411-417

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICCI61671.2024.10485037 · Ref ID: 4687

The exploration of sentiment analysis in multilingual contexts, particularly through the integration of deep learning techniques and knowledge graphs, represents a significant advance in language processing research. This study specifically concentrates on the Arabic language, addressing the challenges presented by its morphological complexity. While the primary focus is Arabic, the research also includes a comprehensive review of related work in other languages such as Bangla and Chinese. This contextualizes the challenges and solutions found in Arabic sentiment analysis within a broader multilingual landscape. Utilizing pre-trained language models like BERT, the research has achieved noteworthy improvements in sentiment analysis accuracy and efficiency, particularly for the Arabic language. The integration of knowledge graphs stands out as a crucial innovation, offering essential contextual insights and mitigating the limitations posed by sparse labeled datasets in Arabic, a language less resourced compared to English. The findings of this study highlight the effectiveness of tailored BERT models for Arabic sentiment analysis, revealing the vast potential and inherent challenges of employing knowledge graphs and large language models for a deeper, more nuanced understanding. The future direction of this research includes enhancing these methods with cutting-edge machine learning techniques, aiming to further refine sentiment analysis processes and knowledge graph construction with a focus on Arabic within a multilingual framework. © 2024 IEEE.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#856 - Lourie 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark

Lourie, N.; Le Bras, R.; Bhagavatula, C.; Choi, Y. J.; Assoc Advancement Artificial, Intelligence

35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence 2021;35():13480-13488

Electr Network Assoc Advancement Artificial Intelligence 2021

Ref ID: 3543

Commonsense AI has long been seen as a near-impossible goal until recently. Now, research interest has sharply increased with an influx of new benchmarks and models. We propose two new ways to evaluate commonsense models, emphasizing their generality on new tasks and building on diverse, recently introduced benchmarks. First, we propose a new multitask benchmark, RAINBOW, to promote research on commonsense models that generalize well over multiple tasks and datasets. Second, we propose a novel evaluation, the cost equivalent curve, that sheds new insight on how the choice of source datasets, pretrained language models, and transfer learning methods impacts performance and data efficiency. We perform extensive experiments, over 200 in total, encompassing 4800 models, and report multiple valuable and sometimes surprising findings, e.g., that transfer almost always leads to better or equivalent performance if following a particular recipe, that QA-based commonsense datasets transfer well with each other, while commonsense knowledge graphs do not, and that, perhaps counter-intuitively, larger models benefit more from transfer than smaller ones. Last but not least, we introduce a new universal commonsense reasoning model, UNICORN, that establishes new state-of-the-art performance across 8 popular commonsense benchmarks: aNLI (87.3%), CosmosQA (91.8%), HellaSWAG (93.9%), PIQA (90.1%), SocialIQA (83.2%), WinoGrande (86.6%), CycIC (94.0%), and CommonsenseQA (79.3%).

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1330 - Lovelace 2022
A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion

Lovelace, J.; Rosé, C. P.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5937-5955

Association for Computational Linguistics (ACL) 2022

Ref ID: 5468

Recent work has demonstrated that entity representations can be extracted from pre-trained language models to develop knowledge graph completion models that are more robust to the naturally occurring sparsity found in knowledge graphs. In this work, we conduct a comprehensive exploration of how to best extract and incorporate those embeddings into knowledge graph completion models. We explore the suitability of the extracted embeddings for direct use in entity ranking and introduce both unsupervised and supervised processing methods that can lead to improved downstream performance. We then introduce supervised embedding extraction methods that can extract more informative representations. Finally, we synthesize our findings and develop a knowledge graph completion model that significantly outperforms recent neural models. © 2022 Association for Computational Linguistics.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#2766 - Lozano 2014
Ontology View Extraction: An Approach Based on Ontological Meta-properties

Lozano, J.; Carbonera, J.; Abel, M.; Pimenta, M.

2014 IEEE 26th International Conference on Tools with Artificial Intelligence 2014;():122-129

2014

DOI: 10.1109/ICTAI.2014.28 · Ref ID: 6111

Ontologies have been applied in Computer Science to ensure semantic interoperability among multiple systems. With the increasing availability of ontologies, many approaches for promoting the sharing and reuse of ontologies have been investigated in recent years, such as ontology module extraction (modularization) and ontology view extraction. Approaches for ontology module extraction are used for extracting modules from large ontologies. On the other hand, ontology views have been used to provide the user with only the parts of the ontology that are useful for a given task. Thus, both ontology views and ontology modules encapsulate a subset of the original ontology, but they have different purposes. The literature has explored ontological meta-properties (such as identity and rigidity) for guiding the modeling decisions that are made during the ontology engineering process. Some of these meta-properties were formalized in foundational ontologies such as UFO (Unified Foundational Ontology). In this paper, we explore the use of ontological meta-properties for extracting ontology views. We propose a characterization of the notion of a well-founded ontology view, considering the ontological meta-properties of the concepts. Besides that, we also propose a language-independent algorithm for sub-ontology extraction that is guided by ontological meta-properties. Finally, we present a case study illustrating the application of our approach.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#334 - Lu 2023
HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting

Lu, J. Y.; Shen, J. M.; Xiong, B.; Ma, W. J.; Staab, S.; Yang, C.; Acm

46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2023;():2052-2056

Taipei, TAIWAN Assoc Computing Machinery 2023

DOI: 10.1145/3539618.3591997 · Ref ID: 3437

Medical decision-making processes can be enhanced by comprehensive biomedical knowledge bases, which require fusing knowledge graphs constructed from different sources via a uniform index system. The index system often organizes biomedical terms in a hierarchy to provide the aligned entities with fine-grained granularity. To address the challenge of scarce supervision in the biomedical knowledge fusion (BKF) task, researchers have proposed various unsupervised methods. However, these methods heavily rely on ad-hoc lexical and structural matching algorithms, which fail to capture the rich semantics conveyed by biomedical entities and terms. Recently, neural embedding models have proved effective in semantic-rich tasks, but they rely on sufficient labeled data to be adequately trained. To bridge the gap between the scarce-labeled BKF and neural embedding models, we propose HiPrompt, a supervision-efficient knowledge fusion framework that elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts. Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#3463 - Lu 2024
Generative Design of Functional Metal Complexes Utilizing the Internal Knowledge of Large Language Models

Lu, Jieyu; Song, Zhangde; Zhao, Qiyuan; Du, Yuanqi; Cao, Yirui; Jia, Haojun; Duan, Chenru

arXiv 2024;():

2024

Ref ID: 8746

Designing functional transition metal complexes (TMCs) faces challenges due to the vast search space of metals and ligands, requiring efficient optimization strategies. Traditional genetic algorithms (GAs) are commonly used, employing random mutations and crossovers driven by explicit mathematical objectives to explore this space. Transferring knowledge between different GA tasks, however, is difficult. We integrate large language models (LLMs) into the evolutionary optimization framework (LLM-EO) and apply it in both single- and multi-objective optimization for TMCs. We find that LLM-EO surpasses traditional GAs by leveraging the chemical knowledge of LLMs gained during their extensive pretraining. Remarkably, without supervised fine-tuning, LLMs utilize the full historical data from optimization processes, outperforming those focusing only on top-performing TMCs. LLM-EO successfully identifies eight of the top-20 TMCs with the largest HOMO-LUMO gaps by proposing only 200 candidates out of a 1.37 million TMCs space. Through prompt engineering using natural language, LLM-EO introduces unparalleled flexibility into multi-objective optimizations, thereby circumventing the necessity for intricate mathematical formulations. As generative models, LLMs can suggest new ligands and TMCs with unique properties by merging both internal knowledge and external chemistry data, thus combining the benefits of efficient optimization and molecular generation. With increasing potential of LLMs as pretrained foundational models and new post-training inference strategies, we foresee broad applications of LLM-based evolutionary optimization in chemistry and materials design.

yuexi voted
Kwesi voted
Final decision
What was the agreed final decision?

#654 - Lu 2023
PKAT: Pre-training in Collaborative Knowledge Graph Attention Network for Recommendation

Lu, Y. H.; Wang, C. D.; Lai, P. Y.; Lai, J. H.

23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():448-457

Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023

DOI: 10.1109/icdm58522.2023.00054 · Ref ID: 3011

With the rapid growth of online platforms and the abundance of available information, personalized recommender systems have become essential for assisting users in discovering relevant and interesting content. Among the various methods, knowledge-aware recommendation models have achieved notable success by leveraging the rich semantic information encoded in knowledge graphs. However, they overlook the fact that users' historical click sequences can better reflect their preferences within a period of time, thus imposing certain limitations on the recommendation performance. On the other hand, the application of pre-trained language models in recommender systems has demonstrated increasingly significant potential, as they can capture sequential patterns and dependencies within users' historical click sequences and effectively capture contextual information in user-item interactions. To this end, we propose a hybrid recommendation model that leverages Pre-training in the collaborative Knowledge graph Attention neTwork (PKAT) to extract both the high-order connectivity information in collaborative knowledge graphs and the contextual information in users' historical click sequences captured by Bidirectional Encoder Representations from Transformers (BERT). The collaborative knowledge graph attention network enables the model to effectively capture the intricate relationships between users, items, and knowledge entities, thus enhancing the representation learning process. Furthermore, what sets PKAT apart from other state-of-the-art knowledge-aware recommendation methods is the incorporation of the BERT language model. This integration allows PKAT to capture the contextual sequence information of user behavior, enabling it to generate more accurate and personalized recommendations. Extensive experiments are conducted on multiple benchmark datasets, and the results demonstrate that our PKAT model outperforms several state-of-the-art baselines.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1014 - Lu 2003
Automatically acquiring Chinese parsing knowledge based on a bilingual language model

Lu, Y. J.; Li, S.; Zhao, T. J.

Jisuanji Xuebao 2003;26(1):32-38

2003

Ref ID: 5825

Knowledge acquisition is a bottleneck for real applications of Chinese parsing. This paper presents a new method to acquire Chinese parsing knowledge from sentence-aligned English-Chinese bilingual corpora. Using English parsing and word alignment results, the method first performs bilingual structure alignment based on a bilingual language model, Inversion Transduction Grammars. Then, Chinese bracketing structures are extracted automatically. The method creates structurally bracketed Chinese corpora by taking full advantage of English parsing and bilingual corpora. The created corpora are very useful for further Chinese corpus annotation and parsing knowledge acquisition. Preliminary experiments show that the acquired knowledge accords well with manually crafted knowledge. This method is particularly useful for acquiring parsing knowledge for an under-studied language from a well-studied second language. Although this paper addresses Chinese and English, the proposed method is also applicable to other language pairs.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#1081 - Lu 2024
ClinicalRAG: Enhancing Clinical Decision Support through Heterogeneous Knowledge Retrieval

Lu, Y.; Wang, J.; Zhao, X.

KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():64-68

Association for Computational Linguistics (ACL) 2024

Ref ID: 4281

Large Language Models (LLMs) have revolutionized text generation across diverse domains, showcasing an ability to mimic human-like text with remarkable accuracy. Yet, these models frequently encounter a significant hurdle: producing hallucinations, a flaw particularly detrimental in the healthcare domain where precision is crucial. In this paper, we introduce ClinicalRAG, a novel multi-agent pipeline to rectify this issue by incorporating heterogeneous medical knowledge-both structured and unstructured-into LLMs to bolster diagnosis accuracy. ClinicalRAG can extract related medical entities from user inputs and dynamically integrate relevant medical knowledge during the text generation process. Comparative analyses reveal that ClinicalRAG significantly outperforms knowledge-deficient methods, offering enhanced reliability in clinical decision support. This advancement marks a pivotal proof-of-concept step towards mitigating misinformation risks in healthcare applications of LLMs. © 2024 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1218 - Lu 2024
Dynamic Reasoning with Language Model and Knowledge Graph for Question Answering

Lu, Y.; Wu, D.; Zhang, Y.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14807 LNCS():441-455

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-3-031-70546-5_26 · Ref ID: 4213

Question answering (QA) involves reasoning about the context and latent knowledge of complex textual descriptions. Current research focuses on how to effectively utilize knowledge graphs (KG) to enhance language models (LM) with external knowledge. In previous works, the interactions between the QA context and the KG were limited, and the KG input to the model contained noisy nodes, greatly restricting the model's reasoning ability. We propose a dynamic reasoning model, DLM-KG, based on LM and KG. It resolves the above challenges through dynamic hierarchical interaction between the QA context and the KG, joint reasoning between the LM and the KG, and dynamic pruning of the KG. Specifically, DLM-KG extracts hierarchical features from KG representations and performs inter-layer and intra-layer interactions in each iteration. The features from these interactions enter the joint reasoning module, where each QA context feature and KG feature mutually attend to each other. The representations of the two modalities are fused and updated through multi-step interactions. Finally, using the information provided by the interaction layer, irrelevant nodes in the KG are removed. Experiments conducted on the commonsense datasets CommonsenseQA and OpenbookQA, and on the medical question-answering dataset MedQA-USMLE, show that performance on MedQA-USMLE is superior to baseline models, while on the other datasets performance is close to the baselines, demonstrating its competitiveness in terms of reasoning ability. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1126 - Lu 2021
Construction of Diabetes Knowledge Graph Based on Deep Learning

Lu, Y.; Zhao, R.; Huang, S.; Liu, R.

Proceedings - 2021 7th Annual International Conference on Network and Information Systems for Computers, ICNISC 2021 2021;():966-970

Institute of Electrical and Electronics Engineers Inc. 2021

DOI: 10.1109/ICNISC54316.2021.00181 · Ref ID: 5664

To integrate medical data that is scattered over the internet, natural language processing (NLP) is widely used in medical text mining. BERT (Bidirectional Encoder Representations from Transformers) is outstanding among many other representation models, and vector representations based on the BERT pre-trained language model can help the target task learn more semantic information. A knowledge graph intuitively reveals the relationships between entities and helps explore deeper semantic connections between them. There are three important parts in the construction of a knowledge graph: entity extraction, relation extraction, and graph generation. Based on these methods, this paper proposes a BERT-based named entity recognition model, BERT-BiLSTM-CRF, which outperforms established methods. In the relation extraction part, BERT-Softmax is used to improve the semantic expression, and its F1-value increased by 12 percent compared with the traditional entity relation extraction model. Based on the above, the entities of diabetes and their relationships were redefined to enrich the semantics of the knowledge graph. Finally, the Neo4j graph database was used to realize the visualization of the diabetes knowledge graph. © 2021 IEEE.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3884 - Lu 2022
Structured Knowledge Grounding for Question Answering

Lu, Yujie; Ouyang, Siqi; Zhou, Kairui

arXiv 2022;():

2022

Ref ID: 7579

Can language models (LMs) ground question-answering (QA) tasks in a knowledge base via their inherent relational reasoning ability? While previous models that use only LMs have seen some success on many QA tasks, more recent methods include knowledge graphs (KGs) to complement LMs with more logic-driven implicit knowledge. However, how to effectively extract information from structured data such as KGs to empower LMs remains an open question, and current models rely on graph techniques to extract knowledge. In this paper, we propose to leverage LMs alone to combine language and knowledge for knowledge-based question answering with flexibility, breadth of coverage, and structured reasoning. Specifically, we devise a knowledge construction method that retrieves the relevant context with a dynamic hop, which offers more comprehensiveness than traditional GNN-based techniques. And we devise a deep fusion mechanism to further bridge the information-exchange bottleneck between the language and the knowledge. Extensive experiments show that our model consistently demonstrates state-of-the-art performance on the CommonsenseQA benchmark, showcasing the possibility of leveraging LMs solely to robustly ground QA in the knowledge base.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1640 - Luan 2024
A Methodology for Generating and Optimizing Chain-of-Thought Based on Knowledge Graphs

Luan, Q.

Advances in Transdisciplinary Engineering 2024;47():313-324

IOS Press BV 2024

DOI: 10.3233/ATDE231203 · Ref ID: 4126

One of the critical indicators for assessing the practical applicability of large language models is their competency in vertical-domain question-answering tasks. However, in real-world applications, fine-tuning these large models often compromises their inherent capabilities. Moreover, fine-tuning does not offer precise control over the model's generated outputs. Consequently, enhancing the question-answering performance of large models in specialized domains has become a focal concern in the field. To address these challenges, this paper introduces a novel approach for generating and optimizing a 'Chain-of-Thought' (CoT), leveraging domain-specific knowledge graphs. Specifically, we propose a Knowledge Graph-generated Chain of Thought (KGCoT) method that utilizes graph search algorithms to generate a chain of thought. This chain guides the injection of specialized knowledge into large language models and adapts the weightings based on user feedback, thereby optimizing subsequent graph searches. Heuristic searches are performed on the knowledge graph based on edge weights, culminating in the amalgamation of discovered entities and knowledge into a chain of thought. This KGCoT serves as a prompt to stimulate the large language model's contemplation of domain-specific knowledge. Additionally, an adaptive weight optimization formula refines the chain's weights in response to output feedback, thereby continually enhancing the quality of future search results and ensuring real-time optimization capabilities for the model. Through empirical evaluations conducted on publicly available datasets, the large language model ChatGLM, when prompted with a KGCoT, exhibited a 72.8% improvement in its BLEU score compared to its baseline performance. This outperformed other models like LLaMA and RWKV, unequivocally substantiating the efficacy of the proposed KGCoT method. © 2024 The Authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1965 - LuísFerreira 2024
Towards Automated Evaluation of Knowledge Encoded in Large Language Models

Luís Ferreira, B. C.; Silva, C.; Gonçalo Oliveira, H.

Proceedings of the Workshop on DLnLD 2024: Deep Learning and Linked Data at LREC-COLING 2024 - Workshop Proceedings 2024;():76-85

European Language Resources Association (ELRA) 2024

Ref ID: 4642

Large Language Models (LLMs) have a significant user base and are gaining increasing interest and impact across various domains. Given their expanding influence, it is crucial to implement appropriate guardrails or controls to ensure ethical and responsible use. In this paper, we propose to automate the evaluation of the knowledge stored in LLMs. This is achieved by generating datasets tailored for this specific purpose, in any selected domain. Our approach consists of four major steps: (i) extraction of relevant entities; (ii) gathering of domain properties; (iii) dataset generation; and (iv) model evaluation. In order to materialize this vision, tools and resources were experimented with for entity linking, knowledge acquisition, classification, and prompt generation, yielding valuable insights and lessons. The generation of datasets for domain-specific model evaluation has successfully proved that the approach can be a future tool for evaluating LLM “black boxes” and moving them toward human-interpretable knowledge bases. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Mike voted
Kwesi voted

#1788 - Luo 2024
Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning

Luo, L.; Li, Y. F.; Haffari, G.; Pan, S.

12th International Conference on Learning Representations, ICLR 2024 2024;():

International Conference on Learning Representations, ICLR 2024

Ref ID: 4606

Large language models (LLMs) have demonstrated impressive reasoning abilities in complex tasks. However, they lack up-to-date knowledge and experience hallucinations during reasoning, which can lead to incorrect reasoning processes and diminish their performance and trustworthiness. Knowledge graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. Nevertheless, existing KG-based LLM reasoning methods only treat KGs as factual knowledge bases and overlook the importance of their structural information for reasoning. In this paper, we propose a novel method called reasoning on graphs (RoG) that synergizes LLMs with KGs to enable faithful and interpretable reasoning. Specifically, we present a planning-retrieval-reasoning framework, where RoG first generates relation paths grounded by KGs as faithful plans. These plans are then used to retrieve valid reasoning paths from the KGs for LLMs to conduct faithful reasoning. Furthermore, RoG not only distills knowledge from KGs to improve the reasoning ability of LLMs through training but also allows seamless integration with any arbitrary LLMs during inference. Extensive experiments on two benchmark KGQA datasets demonstrate that RoG achieves state-of-the-art performance on KG reasoning tasks and generates faithful and interpretable reasoning results. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
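The planning-retrieval step can be pictured with a tiny sketch: given a relation-path plan such as `["spouse", "born_in"]`, retrieve the entity paths that ground it in the KG. The mini-graph and function below are illustrative assumptions, not the authors' RoG implementation:

```python
# Hypothetical mini-KG indexed by (entity, relation) -> list of tail entities.
KG_EDGES = {
    ("Alice", "spouse"): ["Bob"],
    ("Bob", "born_in"): ["Paris"],
}

def retrieve_paths(start, relation_plan):
    """Instantiate a relation-path plan into concrete entity paths by
    expanding each hop against the KG; paths that cannot be grounded drop out."""
    paths = [[start]]
    for rel in relation_plan:
        paths = [p + [tail]
                 for p in paths
                 for tail in KG_EDGES.get((p[-1], rel), [])]
    return paths
```

In the paper's framework, such grounded paths are then handed to the LLM as faithful evidence for the reasoning step.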

Ishan voted
Srividya voted

#1926 - Luo 2023
Systematic Assessment of Factual Knowledge in Large Language Models

Luo, L.; Vu, T. T.; Phung, D.; Haffari, G.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():13272-13286

Association for Computational Linguistics (ACL) 2023

Ref ID: 5082

Previous studies have relied on existing question-answering benchmarks to evaluate the knowledge stored in large language models (LLMs). However, this approach has limitations regarding factual knowledge coverage, as it mostly focuses on generic domains which may overlap with the pretraining data. This paper proposes a framework to systematically assess the factual knowledge of LLMs by leveraging knowledge graphs (KGs). Our framework automatically generates a set of questions and expected answers from the facts stored in a given KG, and then evaluates the accuracy of LLMs in answering these questions. We systematically evaluate the state-of-the-art LLMs with KGs in generic and specific domains. The experiment shows that ChatGPT is consistently the top performer across all domains. We also find that LLM performance depends on instruction finetuning, domain, and question complexity, and is prone to adversarial context. © 2023 Association for Computational Linguistics.
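The core recipe, turning KG facts into questions and measuring exact-match accuracy, can be sketched as follows. The triples, template, and mock answering function are hypothetical placeholders rather than the paper's pipeline:

```python
# Hypothetical facts and a per-relation question template; the paper
# automates this generation at KG scale.
TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("Canberra", "capital_of", "Australia"),
]
TEMPLATES = {"capital_of": "Which country is {head} the capital of?"}

def make_questions(triples):
    """Turn (head, relation, tail) facts into (question, expected answer) pairs."""
    return [(TEMPLATES[r].format(head=h), t) for h, r, t in triples if r in TEMPLATES]

def accuracy(answer_fn, qa_pairs):
    """Exact-match accuracy of any answering callable on the generated set."""
    correct = sum(answer_fn(q).strip().lower() == a.lower() for q, a in qa_pairs)
    return correct / len(qa_pairs)
```

`answer_fn` would wrap an actual LLM call in practice; any callable from question string to answer string fits the interface.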

yuexi voted
Mike voted

#3258 - Luo 2023
ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning

Luo, Linhao; Ju, Jiaxin; Xiong, Bo; Li, Yuan-Fang; Haffari, Gholamreza; Pan, Shirui

arXiv 2023;():

2023

Ref ID: 7829

Logical rules are essential for uncovering the logical connections between relations, which could improve reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from computationally intensive searches over the rule space and a lack of scalability for large-scale KGs. Besides, they often ignore the semantics of relations which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in the field of natural language processing and various applications, owing to their emergent ability and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs. Last, the ranked rules can be used to conduct reasoning over KGs. ChatRule is evaluated on four large-scale KGs, w.r.t. different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method.

Srividya voted
Xinchen voted

#3930 - Luo 2023
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

Luo, Man; Kumbhar, Shrinidhi; Shen, Ming; Parmar, Mihir; Varshney, Neeraj; Banerjee, Pratyay; Aditya, Somak; Baral, Chitta

arXiv 2023;():

2023

Ref ID: 7860

Logical reasoning is fundamental for humans yet presents a substantial challenge in the domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and Reasoning (KR) systems that did not scale and required non-trivial manual effort. Recently, the emergence of large language models (LLMs) has demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. Consequently, there is a growing interest in using LLMs for logical reasoning via natural language. This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area, with a focus on the logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning. To offer a thorough analysis, we have compiled a benchmark titled LogiGLUE. This includes 24 varied datasets encompassing deductive, abductive, and inductive reasoning. Utilizing LogiGLUE as a foundation, we have trained an instruction fine-tuned language model, resulting in LogiT5. We study single-task training, multi-task training, and a "chain-of-thought" knowledge distillation fine-tuning technique to assess the performance of the model across the different logical reasoning categories. We also assess various LLMs using LogiGLUE, and the findings indicate that LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We aim to shed light on the capabilities and potential pathways for enhancing logical reasoning proficiency in LLMs, paving the way for more advanced and nuanced developments in this critical field.

brandon voted
Kwesi voted

#3114 - Luo 2024
Bridging Gaps in Content and Knowledge for Multimodal Entity Linking

Luo, Pengfei; Xu, Tong; Liu, Che; Zhang, Suojuan; Xu, Linli; Li, Minglei; Chen, Enhong

Proceedings of the 32nd ACM International Conference on Multimedia 2024;():9311–9320

Melbourne VIC, Australia Association for Computing Machinery 2024

DOI: 10.1145/3664647.3681661 · Ref ID: 7228

mohammed afaan voted
yuexi voted

#1488 - Luo 2024
KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation

Luo, X.; Sun, Z.; Zhao, J.; Zhao, Z.; Hu, W.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():7146-7159

Association for Computational Linguistics (ACL) 2024

Ref ID: 4482

Parameter-efficient finetuning (PEFT) is a key technique for adapting large language models (LLMs) to downstream tasks. In this paper, we study leveraging knowledge graph embeddings to improve the effectiveness of PEFT. We propose a knowledgeable adaptation method called KnowLA. It inserts an adaptation layer into an LLM to integrate the embeddings of entities appearing in the input text. The adaptation layer is trained in combination with LoRA on instruction data. Experiments on six benchmarks with two popular LLMs and three knowledge graphs demonstrate the effectiveness and robustness of KnowLA. We show that KnowLA can help activate the relevant parameterized knowledge in an LLM to answer a question without changing its parameters or input prompts. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted

#1084 - Lv 2024
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models

Lv, Q.; Wang, J.; Chen, H.; Li, B.; Zhang, Y.; Wu, F.

Proceedings of Machine Learning Research 2024;235():33594-33623

ML Research Press 2024

Ref ID: 4367

Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language model (RALM)-which enhances models with up-to-date knowledge-emerges as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when retrieving lengthy contexts. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method to focus on different granularity-level key texts, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: recaller, scorer, and selector. First, the recaller applies a knowledge graph to extract potential key entities in a given context. Second, the scorer measures the importance of each entity by calculating its contextual weight. Finally, the selector picks high contextual weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, leading to a performance improvement of over 30% in the F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering. Copyright 2024 by the author(s)
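A toy rendition of the recaller/scorer/selector pipeline might look like this. The frequency-based scorer and the sentence-level "highlighting" are simplifying assumptions of mine, since the paper scores entities by contextual weight rather than raw counts:

```python
import re
from collections import Counter

def coarse_to_fine_highlight(context, kg_entities, threshold_ratio=0.5):
    """Toy COFT-style pipeline: count mentions of KG entities (recaller +
    scorer), keep entities above a dynamic threshold (selector), and return
    the sentences that mention them as the 'highlighted' spans."""
    words = re.findall(r"\w+", context.lower())
    counts = Counter(w for w in words if w in kg_entities)
    if not counts:
        return []
    threshold = threshold_ratio * max(counts.values())  # dynamic threshold
    keep = {e for e, c in counts.items() if c >= threshold}
    return [s.strip() for s in context.split(".")
            if s.strip() and any(e in s.lower() for e in keep)]
```

The dynamic threshold is the key idea retained here: the cutoff scales with the strongest entity score instead of being fixed, so sparse and dense contexts are handled uniformly.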

Mike voted
Srividya voted

#481 - Lysyuk 2024
Konstruktor: A Strong Baseline for Simple Knowledge Graph Question Answering

Lysyuk, M.; Salnikov, M.; Braslavski, P.; Panchenko, A.

2nd International Conference on Engineering Manufacture (EM) 2024;():107-118

Porto, PORTUGAL Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-70242-6_11 · Ref ID: 2987

While being one of the most popular question types, simple questions such as "Who is the author of Cinderella?" are still not completely solved. Surprisingly, even the most powerful modern Large Language Models (LLMs) are prone to errors on such questions, especially when dealing with rare entities. At the same time, as an answer may be one hop away from the question entity, one can try to develop a method that uses structured knowledge graphs (KGs) to answer such questions. In this paper, we introduce Konstruktor -- an efficient and robust approach that breaks down the problem into three steps: (i) entity extraction and entity linking, (ii) relation prediction, and (iii) querying the knowledge graph. Our approach integrates language models and knowledge graphs, exploiting the power of the former and the interpretability of the latter. We experiment with two named entity recognition and entity linking methods and several relation detection techniques. We show that for relation detection, the most challenging step of the workflow, a combination of relation classification/generation and ranking outperforms other methods. On four datasets, we report the strong performance of Konstruktor.
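The three-step recipe can be mimicked in a few lines. The mini-KG, keyword-based entity linker, and relation predictor below are deliberately naive stand-ins for the NER, linking, and ranking components the paper actually evaluates:

```python
# Hypothetical one-hop KG and lookup tables for illustration only.
KG = {("Cinderella", "author"): "Charles Perrault"}
ENTITY_INDEX = {"cinderella": "Cinderella"}
RELATION_KEYWORDS = {"author": "author", "wrote": "author"}

def link_entity(question):
    """Step (i): naive entity linking via a surface-form index."""
    for token in question.lower().replace("?", "").split():
        if token in ENTITY_INDEX:
            return ENTITY_INDEX[token]
    return None

def predict_relation(question):
    """Step (ii): naive relation prediction via keyword matching."""
    for kw, rel in RELATION_KEYWORDS.items():
        if kw in question.lower():
            return rel
    return None

def answer(question):
    """Step (iii): one-hop lookup in the knowledge graph."""
    entity, relation = link_entity(question), predict_relation(question)
    return KG.get((entity, relation))
```

Against a real KG such as Wikidata, step (iii) would be a SPARQL query instead of a dictionary lookup, but the decomposition is the same.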

Mike voted
Srividya voted

#3473 - Lyu 2024
GP-GPT: Large Language Model for Gene-Phenotype Mapping

Lyu, Yanjun; Wu, Zihao; Zhang, Lu; Zhang, Jing; Li, Yiwei; Ruan, Wei; Liu, Zhengliang; Yu, Xiaowei; Cao, Chao; Chen, Tong; Chen, Minheng; Zhuang, Yan; Li, Xiang; Liu, Rongjie; Huang, Chao; Li, Wentao; Liu, Tianming; Zhu, Dajiang

arXiv 2024;():

2024

Ref ID: 8600

Pre-trained large language models (LLMs) have attracted increasing attention in biomedical domains due to their success in natural language processing. However, the complex traits and heterogeneity of multi-source genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. Our model is fine-tuned in two stages on a comprehensive corpus composed of over 3,000,000 terms in genomics, proteomics, and medical genetics, derived from multiple large-scale validated datasets and scientific publications. GP-GPT demonstrates proficiency in accurately retrieving medical genetics information and performing common genomics analysis tasks, such as genomics information retrieval and relationship determination. Comparative experiments across domain-specific tasks reveal that GP-GPT outperforms state-of-the-art LLMs, including Llama2, Llama3 and GPT-4. These results highlight GP-GPT's potential to enhance genetic disease relation research and facilitate accurate and efficient analysis in the fields of genomics and medical genetics. Our investigation demonstrated the subtle changes of bio-factor entities' representations in GP-GPT, which suggests opportunities for applying LLMs to advance gene-phenotype research.

Davis voted
mohammed afaan voted

#3841 - Lyu 2024
Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation

Lyu, Yuanjie; Niu, Zihan; Xie, Zheyong; Zhang, Chao; Xu, Tong; Wang, Yang; Chen, Enhong

arXiv 2024;():

2024

Ref ID: 8410

Despite the significant progress of large language models (LLMs) in various tasks, they often produce factual errors due to their limited internal knowledge. Retrieval-Augmented Generation (RAG), which enhances LLMs with external knowledge sources, offers a promising solution. However, these methods can be misled by irrelevant paragraphs in retrieved documents. Due to the inherent uncertainty in LLM generation, inputting the entire document may introduce off-topic information, causing the model to deviate from the central topic and affecting the relevance of the generated content. To address these issues, we propose the Retrieve-Plan-Generation (RPG) framework. RPG generates plan tokens to guide subsequent generation in the plan stage. In the answer stage, the model selects relevant fine-grained paragraphs based on the plan and uses them for further answer generation. This plan-answer process is repeated iteratively until completion, enhancing generation relevance by focusing on specific topics. To implement this framework efficiently, we utilize a simple but effective multi-task prompt-tuning method, enabling the existing LLMs to handle both planning and answering. We comprehensively compare RPG with baselines across 5 knowledge-intensive generation tasks, demonstrating the effectiveness of our approach.
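The iterative plan-then-answer loop could be sketched as below, with a trivial keyword retriever standing in for the paper's retriever and prompt-tuned LLM; all names and data here are illustrative assumptions:

```python
def retrieve(query, corpus):
    """Stand-in retriever: return paragraphs sharing any keyword with the query."""
    return [p for p in corpus if any(w in p.lower() for w in query.lower().split())]

def rpg_loop(question, corpus, plan_steps):
    """Iterate plan -> retrieve -> answer: each plan step names a subtopic,
    only fine-grained paragraphs matching that step feed its answer part."""
    answer_parts = []
    for step in plan_steps:                    # plan stage
        relevant = retrieve(step, corpus)      # select fine-grained paragraphs
        answer_parts.append(relevant[0] if relevant else "(no evidence)")
    return " ".join(answer_parts)              # answer stage
```

In the actual framework, the plan tokens are generated by the model itself rather than supplied, which is what lets the loop stay on topic without ingesting whole documents.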

yuexi voted
Srividya voted

#1460 - Ma 2023
KAPALM: Knowledge grAPh enhAnced Language Model for Fake News Detection

Ma, J.; Chen, C.; Hou, C.; Yuan, X.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():3999-4009

Association for Computational Linguistics (ACL) 2023

Ref ID: 5058

Social media has not only facilitated news consumption, but also led to the wide spread of fake news. Because news articles in social media are usually condensed and full of knowledge entities, existing methods of fake news detection use external entity knowledge to improve the effectiveness. However, the majority of these methods focus on news entity information and ignore the structured relation knowledge among news entities. To address this issue, in this work, we propose a Knowledge grAPh enhAnced Language Model (KAPALM) which is a novel model that fuses coarse- and fine-grained representations of entity knowledge from Knowledge Graphs (KGs). Firstly, we identify entities in news content and link them to entities in KGs. Then, a subgraph of KGs is extracted to provide structured relation knowledge of entities in KGs and fed into a graph neural network to obtain the coarse-grained knowledge representation. This subgraph is pruned to provide fine-grained knowledge and fed into the attentive graph pooling layer. Finally, we integrate the coarse- and fine-grained entity knowledge representations with the representation of news content for fake news detection. The experimental results on two benchmark datasets show that our method is superior to state-of-the-art baselines in the full-scale setting. In addition, our model is competitive in the few-shot setting. © 2023 Association for Computational Linguistics.

Srividya voted
Ishan voted

#731 - Ma 2024
A review of graph neural networks and pretrained language models for knowledge graph reasoning

Ma, J. T.; Liu, B.; Li, K. L.; Li, C. L.; Zhang, F.; Luo, X. Y.; Qiao, Y. Q.

Neurocomputing 2024;609():20

2024

DOI: 10.1016/j.neucom.2024.128490 · Ref ID: 2997

Knowledge Graph (KG) stores human knowledge facts in an intuitive graphical structure but faces challenges such as incomplete construction or inability to handle new knowledge. Knowledge Graph Reasoning (KGR) can make KGs more accurate, complete, and trustworthy to support various artificial intelligence applications better. Currently, the popular KGR methods are based on graph neural networks (GNNs). Recent studies have shown that hybrid logic rules and synergized pre-trained language models (PLMs) can enhance the GNN-based KGR methods. These methods mainly focus on data sparsity, insufficient knowledge evolution patterns, multimodal fusion, and few-shot reasoning. Although many studies have been conducted, there are still few review papers that comprehensively summarize and explore KGR methods related to GNNs, logic rules, and PLMs. Therefore, this paper provides a comprehensive review of GNNs and PLMs for KGR based on a large number of high-quality papers. To present a clear overview of KGR, we propose a general framework. Specifically, we first introduce the KG preparation. Then we provide an overview of KGR methods, in which we categorize KGR methods into GNNs-based, logic rules-enhanced, and pre-trained language models-enhanced KGR methods. Furthermore, we also compare and analyze the GNN-based KGR methods in two scenarios. Moreover, we also present the application of KGR in different fields. Finally, we discuss the current challenges and future research directions for KGR.

Ishan voted
Srividya voted

#628 - Ma 2023
Ontology-Based BERT Model for Automated Information Extraction from Geological Hazard Reports

Ma, K.; Tian, M.; Tan, Y. J.; Qiu, Q. J.; Xie, Z.; Huang, R.

J. Earth Sci. 2023;34(5):1390-1405

2023

DOI: 10.1007/s12583-022-1724-z · Ref ID: 3301

Geological knowledge can provide support for knowledge discovery, knowledge inference and mineralization predictions of geological big data. Entity identification and relationship extraction from geological data description text are the key links for constructing knowledge graphs. Given the lack of publicly annotated datasets in the geology domain, this paper illustrates the construction process of geological entity datasets, defines the types of entities and interconceptual relationships by using the geological entity concept system, and completes the construction of the geological corpus. To address the shortcomings of existing language models (such as Word2vec and GloVe) that cannot solve polysemous words and have a poor ability to fuse contexts, we propose a geological named entity recognition and relationship extraction model built jointly with the Bidirectional Encoder Representations from Transformers (BERT) pretrained language model. To effectively represent the text features, we construct a BERT-bidirectional gated recurrent unit network (BiGRU)-conditional random field (CRF)-based architecture to extract the named entities and a BERT-BiGRU-Attention-based architecture to extract the entity relations. The results show that the F1-score of the BERT-BiGRU-CRF named entity recognition model is 0.91 and the F1-score of the BERT-BiGRU-Attention relationship extraction model is 0.84, which are significant performance improvements when compared to classic language models (e.g., Word2vec and Embeddings from Language Models (ELMo)).

Mike voted
Srividya voted

#1022 - Ma 2023
BERT-based Question Answering using Knowledge Graph Embeddings in Nuclear Power Domain

Ma, Z.; Yan, K.; Wang, H.

Proceedings of the 2023 26th International Conference on Computer Supported Cooperative Work in Design, CSCWD 2023 2023;():267-272

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/CSCWD57460.2023.10152692 · Ref ID: 5266

In order to improve the resource utilization rate of existing nuclear power data and promote workers to efficiently obtain the operation information of nuclear power units and assist them in fault diagnosis and maintenance decision-making, this paper constructs a knowledge graph question answering (KGQA) dataset in the field of nuclear power. The BEm-KGQA model based on the pre-trained language model and knowledge graph embedding method was proposed. Our model learns the embedded representation of the knowledge graph through BERT and fine-tunes the BERT model. In the question embedding stage, it learns the embedded representation of the question based on the fine-tuned BERT model. Through experiments, we demonstrate the effectiveness of the method over other models. In addition, this paper implements a nuclear power question answering system. Based on the question answering system, employees can learn about unit information and efficiently obtain information on unusual operating events of nuclear power. © 2023 IEEE.

Kwesi voted
Ishan voted

#3535 - Madhusudhana 2024
Integrating Cognitive AI with Generative Models for Enhanced Question Answering in Skill-based Learning

Madhusudhana, Rochan H.; Dass, Rahul K.; Luu, Jeanette; Goel, Ashok K.

arXiv 2024;():

2024

Ref ID: 8489

In online learning, the ability to provide quick and accurate feedback to learners is crucial. In skill-based learning, learners need to understand the underlying concepts and mechanisms of a skill to be able to apply it effectively. While videos are a common tool in online learning, they cannot comprehend or assess the skills being taught. Additionally, while Generative AI methods are effective in searching and retrieving answers from a text corpus, it remains unclear whether these methods exhibit any true understanding. This limits their ability to provide explanations of skills or help with problem-solving. This paper proposes a novel approach that merges Cognitive AI and Generative AI to address these challenges. We employ a structured knowledge representation, the TMK (Task-Method-Knowledge) model, to encode skills taught in an online Knowledge-based AI course. Leveraging techniques such as Large Language Models, Chain-of-Thought, and Iterative Refinement, we outline a framework for generating reasoned explanations in response to learners' questions about skills.

Ishan voted
brandon voted

#2741 - Mainetti 2015
A novel rule-based semantic architecture for IoT building automation systems

Mainetti, L.; Mighali, V.; Patrono, L.; Rametta, P.

2015 23rd International Conference on Software, Telecommunications and Computer Networks (SoftCOM) 2015;():124-131

2015

DOI: 10.1109/SOFTCOM.2015.7314063 · Ref ID: 6974

The ever growing number of smart devices connected to the Internet of Things is giving users the chance to sense data from surrounding environment and act upon it. However, interpreting raw data coming from heterogeneous sensors and applying control algorithms to actuators is not a simple task for the common end-user who wants to create applications for smart environments. For these reasons, this work deals with the definition of a novel rule-based semantic architecture for the implementation of building automation applications in an IoT context. Sensor data are abstracted at a high semantic level related to the properties they are associated to and interactions with actuators are driven by high-level desired actions. Applications have the form of an Event-Condition-Action (ECA) rule and the layered architecture separates high-level semantic reasoning aspects from low-level execution details. The proposed architecture is also compared with main state-of-the-art solutions and some suitable technologies for its implementation are suggested.

mohammed afaan voted
yuexi voted

#125 - Malaviya 2020
Commonsense Knowledge Base Completion with Structural and Semantic Context

Malaviya, C.; Bhagavatula, C.; Bosselut, A.; Choi, Y.; Assoc Advancement Artificial, Intelligence

34th AAAI Conference on Artificial Intelligence / 32nd Innovative Applications of Artificial Intelligence Conference / 10th AAAI Symposium on Educational Advances in Artificial Intelligence 2020;34():2925-2933

New York, NY Assoc Advancement Artificial Intelligence 2020

Ref ID: 3109

Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes compared to conventional KBs (18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes. In this paper, we present novel KB completion models that can address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method to incorporate information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions shines light on the types of commonsense knowledge that language models capture well.

Davis voted
Srividya voted

#2760 - Malik 2017
Ontology based context aware model

Malik, S.; Jain, S.

2017 International Conference on Computational Intelligence in Data Science(ICCIDS) 2017;():1-6

2017

DOI: 10.1109/ICCIDS.2017.8272632 · Ref ID: 6297

A top-down approach is followed while developing a context model, which means the application and then its functionality are defined first. Once this is set up, the required context models are developed. In this paper, we present a comparison of existing context ontologies on the basis of different parameters and also provide an ontology-based context model which defines the generic concepts and provides extensibility for adding a domain-specific ontology. This model uses the Extended Hierarchical Censored Production Rule (EHCPR), which is a scheme for representing knowledge.

mohammed afaan voted
Ishan voted

#825 - Maratsi 2024
Towards Cross-Domain Linking of Data: A Semantic Mapping of Cultural Heritage Ontologies

Maratsi, M. I.; Ahmed, U.; Alexopoulos, C.; Charalabidis, Y.; Polini, A.

25th Annual International Conference on Digital Government Research (DGO) - Internet of Beings - Transforming Public Governance 2024;():165-176

Taipei, TAIWAN Assoc Computing Machinery 2024

DOI: 10.1145/3657054.3657077 · Ref ID: 3751

The Linked Open Vocabularies (LOV) registry, designed with the Linked Data principles at core, provides an environment suitable for research which targets domain-specific, but also potentially reusable, information representation. The main purpose of this study is to follow the recommendations pertaining to the utilisation of LOV as a basis for experimentation in order to examine how information within the Cultural Heritage (CH) domain can be improved in terms of reusability and interoperability. The present lack of cross-domain knowledge transfer forms the motivation behind this study, with the aim of facilitating the transition from conventional, domain-specific knowledge representation to reusable and semantically interoperable information. The methodology of this study involves the manual semantic mapping of elements from 12 vocabularies in the LOV registry, reinforced by a small-scale experiment using contemporary large language models (LLMs), particularly GPT, for a preliminary assessment of the mapping process. The findings revealed several key aspects to consider regarding the alignment of semantically adjacent vocabulary elements in the CH domain and beyond, emphasising the potential unveiled by linking domain-focused schemata to standardised, established ones while preserving the conceptual hierarchies inherent to each individual knowledge domain. The contribution of this research pertains to the vision of linking data across different domains by initiating the alignment among representation schemata in CH, with the ultimate aim to expand beyond the boundaries of the in-word knowledge domain, while employing combinatory methodological approaches of technological means and human expertise to facilitate this process.

mohammed afaan voted
yuexi voted

#3356 - Marjanović 2024
DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models

Marjanović, Sara Vera; Yu, Haeun; Atanasova, Pepa; Maistro, Maria; Lioma, Christina; Augenstein, Isabelle

arXiv 2024;():

2024

Ref ID: 8480

Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge. However, conflicting knowledge can be present in the LM's parameters, termed intra-memory conflict, which can affect a model's propensity to accept contextual knowledge. To study the effect of intra-memory conflict on an LM's ability to accept relevant context, we utilize two knowledge conflict measures and a novel dataset containing inherently conflicting data, DynamicQA. This dataset includes facts with a temporal dynamic nature where facts can change over time and disputable dynamic facts, which can change depending on the viewpoint. DynamicQA is the first to include real-world knowledge conflicts and provide context to study the link between the different types of knowledge conflicts. We also evaluate several measures on their ability to reflect the presence of intra-memory conflict: semantic entropy and a novel coherent persuasion score. With our extensive experiments, we verify that LMs exhibit a greater degree of intra-memory conflict with dynamic facts compared to facts that have a single truth value. Furthermore, we reveal that facts with intra-memory conflict are harder to update with context, suggesting that retrieval-augmented generation will struggle with the most commonly adapted facts.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#620 - Martin-Moncunill 2022
On Contrasting YAGO with GPT-J: An Experiment for Person-Related Attributes

Martin-Moncunill, D.; Sicilia, M. A.; González, L.; Rodríguez, D.

4th Iberoamerican Conference and 3rd Indo-American Conference Knowledge Graphs and Semantic Web Conference (KGSWC) 2022;1686():234-245

Madrid, SPAIN Springer International Publishing Ag 2022

DOI: 10.1007/978-3-031-21422-6_17 · Ref ID: 3351

Language models (LMs) trained on large text corpora have demonstrated superior performance in different language-related tasks in recent years. These models implicitly incorporate factual knowledge that can be used to complement existing Knowledge Graphs (KGs), which in most cases are structured from human-curated databases. Here we report an experiment that attempts to gain insights into the extent to which LMs can generate factual information such as that present in KGs. Concretely, we have tested this process using the English Wikipedia subset of YAGO and the GPT-J model for attributes related to individuals. Results show that the generation of correct factual information depends on the model's generation parameters and is unevenly balanced across diverse individuals. Further, the LM can be used to populate additional factual information, but this requires intermediate parsing to correctly map to KG attributes.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#1912 - Martinez 2023
Study-Buddy: A Knowledge Graph-Powered Learning Companion for School Students

Martinez, F.; Collarana, D.; Calvaresi, D.; Arispe, M.; Florida, C.; Calbimonte, J. P.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13998 LNCS():133-137

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-43458-7_25 · Ref ID: 5157

Large Language Models (LLMs) have the potential to substantially improve educational tools for students. However, they face limitations, including factual accuracy, personalization, and the lack of control over the sources of information. This paper presents Study-Buddy, a prototype of a conversational AI assistant for school students that addresses the above-mentioned limitations. Study-Buddy embodies an AI assistant based on a knowledge graph, LLMs, and computational persuasion. It is designed to support educational campaigns as a hybrid AI solution. The demonstrator showcases interactions with Study-Buddy and the crucial role of the knowledge graph in enabling the bot to present appropriate activities to the students. A video demonstrating the main features of Study-Buddy is available at: https://youtu.be/DHPTsN1RI9o. © The Author(s), under exclusive license to Springer Nature Switzerland AG. 2023.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1734 - Maushagen 2024
Populating CSV Files from Unstructured Text with LLMs for KG Generation with RML

Maushagen, J.; Sepehri, S.; Sanctorum, A.; Vanhaecke, T.; De Troyer, O.; Debruyne, C.

CEUR Workshop Proceedings 2024;3759():

CEUR-WS 2024

Ref ID: 4249

We report on an exploratory study using Large Language Models (LLMs) to generate Comma-Separated Values (CSV) files, which are subsequently transformed into Resource Description Framework (RDF) using the RDF Mapping Language (RML). Prior studies have shown that LLMs sometimes have problems generating valid and well-formed RDF from unstructured texts, i.e., issues with RDF, not the contents. We wanted to test whether the generation of CSV led to fewer issues and whether this would be a viable option for allowing domain experts to be actively part of the Knowledge Graph (KG) population process by allowing them to use familiar tools. We have built a prototype illustrating this idea, and the results seem promising for further study. The initial prototype uses zero-shot training and is built on GPT-4. The prototype takes the unstructured text and the CSV file's structure as input and uses the latter to generate prompts to fill in the cells' values. Future work includes analyzing the effect of different prompting strategies. The limitation, however, is that such an approach only works for projects where domain experts work with spreadsheets for pre-existing mappings. © 2024 Copyright for this paper by its authors.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#763 - Mavromatis 2024
SemPool: Simple, Robust, and Interpretable KG Pooling for Enhancing Language Models

Mavromatis, C.; Karypis, P.; Karypis, G.

28th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2024;14648():154-166

Taipei, TAIWAN Springer-Verlag Singapore Pte Ltd 2024

DOI: 10.1007/978-981-97-2238-9_12 · Ref ID: 3346

Knowledge Graph (KG) powered question answering (QA) performs complex reasoning over language semantics as well as knowledge facts. Graph Neural Networks (GNNs) learn to aggregate information from the underlying KG, which is combined with Language Models (LMs) for effective reasoning with the given question. However, GNN-based methods for QA rely on the graph information of the candidate answer nodes, which limits their effectiveness in more challenging settings where critical answer information is not included in the KG. We propose a simple graph pooling approach that learns KG semantics useful for the LM's reasoning and whose effectiveness is robust under graph perturbations. Our method, termed SemPool, represents KG facts with pre-trained LMs, learns to aggregate their semantic information, and fuses it at different layers of the LM. Our experimental results show that SemPool outperforms state-of-the-art GNN-based methods by 2.27% accuracy points on average when answer information is missing from the KG. In addition, SemPool offers interpretability on what type of graph information is fused at different LM layers.
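
The pool-then-fuse idea can be sketched in a toy form: pool per-fact embeddings into one graph-level vector and fuse it additively into an LM hidden state. Note this is a stand-in, not SemPool itself — the paper's aggregation and fusion are learned, whereas here the pooling is a plain mean and the embeddings, `alpha`, and dimensions are hypothetical toy values.

```python
def mean_pool(fact_vecs):
    """Pool a set of KG-fact embeddings into one graph-level vector
    (a simple stand-in for a learned aggregation)."""
    n, d = len(fact_vecs), len(fact_vecs[0])
    return [sum(v[i] for v in fact_vecs) / n for i in range(d)]

def fuse(hidden, pooled, alpha=0.5):
    """Additively fuse the pooled graph vector into an LM hidden state."""
    return [h + alpha * p for h, p in zip(hidden, pooled)]

facts = [[1.0, 0.0], [0.0, 1.0]]   # toy embeddings of two KG facts
pooled = mean_pool(facts)          # [0.5, 0.5]
fused = fuse([0.2, 0.4], pooled)   # approx. [0.45, 0.65]
```

The same fusion step could be applied at several LM layers, which is where the method's layer-wise interpretability comes from.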

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#2403 - Mehrabi 2013
Event Causality Identification Using Conditional Random Field in Geriatric Care Domain

Mehrabi, S.; Krishnan, A.; Tinsley, E.; Sligh, J.; Crohn, N.; Bush, H.; Depasquale, J.; Bandos, J.; Palakal, M.

2013 12th International Conference on Machine Learning and Applications 2013;1():339-343

2013

DOI: 10.1109/ICMLA.2013.69 · Ref ID: 6609

Event extraction is a key step in many text-mining applications such as question-answering, information extraction and summarization systems. In this study we used a conditional random field (CRF) to extract causal events from PubMed articles related to geriatric care. Abstracts in the geriatric care domain were manually reviewed and categorized into 42 different subdomains. There are a total of 19,677 sentences in the collected abstracts from PubMed, out of which 2,856 sentences were selected and manually annotated with cause and effect events. The data set was then divided into training (2,520), validation (252) and test (84) sentence sets. Features such as tokens, token categories, affixes, part of speech and shallow parser output were used as inputs to the CRF model. A window of features before and after each token was used to determine its causal event label using the CRF. A window of four features had the best performance, with 84.6% precision, 87% recall, and 85% F-measure.
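
The windowed feature construction described in the abstract can be sketched as follows. This is a generic illustration of window features for sequence labelling, not the authors' code; the feature names, window size, and example sentence are hypothetical.

```python
def window_features(tokens, i, size=4):
    """Build the feature dict for token i from a window of `size`
    tokens before and after it, as input to a CRF sequence labeller."""
    feats = {"token": tokens[i].lower(), "is_title": tokens[i].istitle()}
    for off in range(1, size + 1):
        if i - off >= 0:
            feats[f"-{off}:token"] = tokens[i - off].lower()
        if i + off < len(tokens):
            feats[f"+{off}:token"] = tokens[i + off].lower()
    return feats

sent = "Falls often cause hip fractures in elderly patients".split()
features = [window_features(sent, i) for i in range(len(sent))]
```

Each feature dict would then be paired with a cause/effect/other label for CRF training; the paper's finding is that `size=4` worked best.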

Mike voted
brandon voted
Final decision
What was the agreed final decision?

#2367 - Mei 2009
An E-negotiation Model Based on Multi-agent and Ontology

Mei, P. Q.; Hong, Z.; Cun, C. Y.; Qin, P. X.

2009 International Conference on Computational Intelligence and Natural Computing 2009;2():107-110

2009

DOI: 10.1109/CINC.2009.263 · Ref ID: 6379

In e-commerce environments, multi-agent systems are often used in automated negotiation. But when agents communicate, they do not necessarily use the same vocabulary or ontology. If they want to interact successfully, they must find correspondences between the terms used in their ontologies. It is obvious that negotiating agent architectures have not been addressed sufficiently. Towards this end, this paper presents a novel agent construction model that enables agents to communicate in the Semantic Web. The Semantic Web uses ontologies to describe the negotiation protocol, which enables agents to gain the necessary knowledge of the protocol from the market. We demonstrate how the model allows us to accomplish these negotiation architectures.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#534 - Meyer 2023
LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT

Meyer, L. P.; Stadler, C.; Frey, J.; Radtke, N.; Junghanns, K.; Meissner, R.; Dziwis, G.; Bulert, K.; Martin, M.

1st Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow (AIDRST) 2023;():103-115

Leipzig, GERMANY Springer Vieweg Verlag 2023

DOI: 10.1007/978-3-658-43705-3_8 · Ref ID: 3006

Knowledge Graphs (KG) provide us with a structured, flexible, transparent, cross-system, and collaborative way of organizing our knowledge and data across various domains in society and industrial as well as scientific disciplines. KGs surpass any other form of representation in terms of effectiveness. However, Knowledge Graph Engineering (KGE) requires in-depth experiences of graph structures, web technologies, existing models and vocabularies, rule sets, logic, as well as best practices. It also demands a significant amount of work. Considering the advancements in large language models (LLMs) and their interfaces and applications in recent years, we have conducted comprehensive experiments with ChatGPT to explore its potential in supporting KGE. In this paper, we present a selection of these experiments and their results to demonstrate how ChatGPT can assist us in the development and management of KGs.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2891 - Miguelañez 2011
Semantic Knowledge-Based Framework to Improve the Situation Awareness of Autonomous Underwater Vehicles

Miguelañez, E.; Patrón, P.; Brown, K. E.; Petillot, Y. R.; Lane, D. M.

IEEE Transactions on Knowledge and Data Engineering 2011;23(5):759-773

2011

DOI: 10.1109/TKDE.2010.46 · Ref ID: 6097

This paper proposes a semantic world model framework for hierarchical distributed representation of knowledge in autonomous underwater systems. This framework aims to provide a more capable and holistic system, involving semantic interoperability among all involved information sources. This will enhance interoperability, independence of operation, and situation awareness of the embedded service-oriented agents for autonomous platforms. The results obtained specifically affect the mission flexibility, robustness, and autonomy. The presented framework makes use of the idea that heterogeneous real-world data of very different type must be processed by (and run through) several different layers, to be finally available in a suited format and at the right place to be accessible by high-level decision-making agents. In this sense, the presented approach shows how to abstract away from the raw real-world data step by step by means of semantic technologies. The paper concludes by demonstrating the benefits of the framework in a real scenario. A hardware fault is simulated in a REMUS 100 AUV while performing a mission. This triggers a knowledge exchange between the status monitoring agent and the adaptive mission planner embedded agent. By using the proposed framework, both services can interchange information while remaining domain independent during their interaction with the platform. The results of this paper are readily applicable to land and air robotics.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3035 - Milea 2012
tOWL: A Temporal Web Ontology Language

Milea, V.; Frasincar, F.; Kaymak, U.

IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 2012;42(1):268-281

2012

DOI: 10.1109/TSMCB.2011.2162582 · Ref ID: 6242

Through its interoperability and reasoning capabilities, the Semantic Web opens a realm of possibilities for developing intelligent systems on the Web. The Web Ontology Language (OWL) is the most expressive standard language for modeling ontologies, the cornerstone of the Semantic Web. However, up until now, no standard way of expressing time and time-dependent information in OWL has been provided. In this paper, we present a temporal extension of the very expressive fragment SHIN(D) of the OWL Description Logic language, resulting in the temporal OWL language. Through a layered approach, we introduce three extensions: 1) concrete domains, which allow the representation of restrictions using concrete domain binary predicates; 2) temporal representation, which introduces time points, relations between time points, intervals, and Allen's 13 interval relations into the language; and 3) timeslices/fluents, which implement a perdurantist view on individuals and allow for the representation of complex temporal aspects, such as process state transitions. We illustrate the expressiveness of the newly introduced language by using an example from the financial domain.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#420 - Miller 2023
Knowledge Enhanced Deep Learning: Application to Pandemic Prediction

Miller, J. A.; Barna, N. H.; Rana, S.; Arpinar, I. B.; Liu, N. H.; Ieee

IEEE 9th International Conference on Collaboration and Internet Computing (CIC) 2023;():42-51

Atlanta, GA Ieee 2023

DOI: 10.1109/cic58953.2023.00016 · Ref ID: 3472

Deep Learning has been successfully applied to many problem domains, yet its advantages have been slow to emerge for time series forecasting. For example, in the well-known M Competitions, until recently, hybrids of traditional statistical or machine learning (e.g., gradient boosting) techniques were the top performers. With the recent architectural advances in deep learning being applied to time series forecasting, such as encoder-decoders with attention, transformers, representation learning, and graph neural networks, deep learning has begun to show its advantages. Still, in the area of pandemic prediction, there remain challenges for deep learning models: the time series is not long enough for effective training, ignorance of accumulated scientific knowledge, and interpretability of the model. Today, there is a vast amount of knowledge available that deep learning models can tap into, including Knowledge Graphs and Large Language Models fine-tuned with scientific domain knowledge. There is ongoing research examining how to utilize or inject knowledge into deep learning models. The state-of-the-art approaches are reviewed and suggestions for further work are provided. Recommendations for how this can be applied to future pandemics are given.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#230 - Mimouni 2019
Entity Embedding Analogy for Implicit Link Discovery

Mimouni, N.; Moissinac, J. C.; Vu, A. T.

16th International Extended Semantic Web Conference (ESWC) 2019;11762():126-129

Portoroz, SLOVENIA Springer International Publishing Ag 2019

DOI: 10.1007/978-3-030-32327-1_25 · Ref ID: 3245

In this work we are interested in the problem of knowledge graph (KG) incompleteness, which we propose to solve by discovering implicit triples from observed ones in the incomplete graph, leveraging analogy structures deduced from a KG embedding model. We use a language modelling approach that we adapt to entities and relations. The first results show that analogical inference in the projected vector space is relevant to a link prediction task.
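
The analogy structure exploited here is the classic vector-offset pattern ("a is to b as c is to ?"). A minimal sketch over toy 2-d entity embeddings, not the paper's model — the embeddings and entity names are hypothetical:

```python
def analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' by nearest neighbour to (b - a + c),
    the offset structure used for implicit link discovery."""
    target = [emb[b][i] - emb[a][i] + emb[c][i] for i in range(len(emb[a]))]
    def dist(e):
        return sum((emb[e][i] - target[i]) ** 2 for i in range(len(target)))
    return min((e for e in emb if e not in (a, b, c)), key=dist)

# Toy embeddings where the capital-of relation is a consistent offset.
emb = {
    "France": [0.0, 0.0], "Paris": [1.0, 0.0],
    "Italy":  [0.0, 1.0], "Rome":  [1.0, 1.0],
}
answer = analogy(emb, "France", "Paris", "Italy")  # -> "Rome"
```

A predicted nearest neighbour like this corresponds to proposing an implicit triple (Italy, capital, Rome) for the incomplete KG.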

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1713 - Miranda-Escalada 2020
Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of CLEF eHealth 2020

Miranda-Escalada, A.; Gonzalez-Agirre, A.; Armengol-Estapé, J.; Krallinger, M.

CEUR Workshop Proceedings 2020;2696():

CEUR-WS 2020

Ref ID: 5729

Clinical coding requires the analysis and transformation of medical narratives into a structured or coded format using internationally recognized classification systems like ICD-10. These codes represent medical diagnoses and procedures. Clinical coding is critical for standardizing medical records, particularly for health information management systems used to carry out biomedical/epidemiological research studies, monitor health trends or facilitate medical billing and reimbursement. The growing amount of clinical records has prompted the search for tools that assist manual coding. Inspired by the CCMC challenge and various eHealth CLEF shared tasks, we organized the CodiEsp track. CodiEsp (eHealth CLEF 2020-Multilingual Information Extraction Shared Task) represents the first effort to promote the development and evaluation of automatic clinical coding systems for medical documents in Spanish. In this context, we have published a set of resources including (i) a manually coded Gold Standard corpus with inter-coder agreement and supporting textual evidence statements, (ii) an additional large collection of medical literature indexed with ICD-10 clinical codes and (iii) a machine-translated corpus to enable multilingual approaches and testing of previous strategies developed for data in English. We have received a total of 168 runs submitted by 22 teams from 11 countries for at least one of our three sub-tracks: CodiEsp-D (Diagnosis Coding), CodiEsp-P (Procedure Coding) and CodiEsp-X (Explainable AI). Despite the considerable complexity of this task, which can be viewed as a hierarchical multi-label classification problem using ICD-10 codes as labels and documents as input, participants obtained very promising results, especially for codes that were well covered by the training data. Participants examined a variety of strategies, specifically deep learning approaches, pre-trained language models and word embeddings (BERT, BETO, FastText, etc.), as well as NER, string lookup and knowledge graph approaches. CodiEsp Corpus: https://zenodo.org/record/3837305. Copyright © 2020 for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3682 - Mitra 2024
LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge

Mitra, Shaswata; Neupane, Subash; Chakraborty, Trisha; Mittal, Sudip; Piplai, Aritran; Gaur, Manas; Rahimi, Shahram

arXiv 2024;():

2024

Ref ID: 8040

Security Operations Center (SOC) analysts gather threat reports from openly accessible global threat databases and customize them manually to suit a particular organization's needs. These analysts also depend on internal repositories, which act as private local knowledge databases for an organization. Credible cyber intelligence, critical operational details, and relevant organizational information are all stored in these local knowledge databases. Analysts undertake a labor-intensive task, utilizing these global and local knowledge databases to manually create an organization's unique threat response and mitigation strategies. Recently, Large Language Models (LLMs) have shown the capability to efficiently process large and diverse knowledge sources. We leverage this ability to process global and local knowledge databases to automate the generation of organization-specific threat intelligence. In this work, we present LOCALINTEL, a novel automated knowledge contextualization system that, upon prompting, retrieves threat reports from the global threat repositories and uses its local knowledge database to contextualize them for a specific organization. LOCALINTEL comprises three key phases: global threat intelligence retrieval, local knowledge retrieval, and contextualized completion generation. The first phase retrieves intelligence from global threat repositories, the second retrieves pertinent knowledge from the local knowledge database, and finally the fusion of these knowledge sources is orchestrated through a generator to produce a contextualized completion.
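
The three-phase pipeline can be sketched as follows. This is a toy illustration only: retrieval here is naive keyword overlap and the final step returns a prompt rather than calling an LLM generator; the document snippets and function names are hypothetical, not from the paper.

```python
def contextualize(query, global_db, local_db):
    """Phase 1: retrieve global threat intel; phase 2: retrieve local
    organizational knowledge; phase 3: build the fused prompt a generator
    LLM would complete."""
    def retrieve(db):
        terms = set(query.lower().split())
        return max(db, key=lambda doc: len(terms & set(doc.lower().split())))
    glob, loc = retrieve(global_db), retrieve(local_db)
    return (f"Global advisory: {glob}\n"
            f"Local context: {loc}\n"
            f"Task: contextualize the advisory for this organization.")

global_db = ["CVE affecting nginx servers", "phishing campaign via email"]
local_db = ["we run nginx on the DMZ", "email is fully outsourced"]
prompt = contextualize("nginx vulnerability", global_db, local_db)
```

A production system would replace the keyword match with dense retrieval over both repositories and pass the fused prompt to the generator.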

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3444 - Mohammadjafari 2024
From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems

Mohammadjafari, Ali; Maida, Anthony S.; Gottumukkala, Raju

arXiv 2024;():

2024

Ref ID: 8646

Since the onset of LLMs, translating natural language queries into structured SQL commands has attracted increasing attention. Unlike previous reviews, this survey provides a comprehensive study of the evolution of LLM-based text-to-SQL systems, from early rule-based models to advanced LLM approaches, and of how LLMs have impacted this field. We discuss benchmarks, evaluation methods and evaluation metrics. We also uniquely study the role of integrating knowledge graphs for better contextual accuracy and schema linking in these systems. The current techniques fall into two categories, in-context learning and fine-tuning, which lead to approaches such as zero-shot and few-shot learning and data augmentation. Finally, we highlight key challenges such as computational efficiency, model robustness, and data privacy, with perspectives toward their development and improvements in potential areas for the future of LLM-based text-to-SQL systems.
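
The schema-linking step mentioned in the abstract can be sketched in a toy zero-shot prompt builder: only tables whose name or columns appear in the question are serialized into the prompt. This is a generic illustration of the technique, not any surveyed system; the schema, matching rule, and prompt layout are hypothetical.

```python
def build_prompt(question, schema):
    """Zero-shot text-to-SQL prompt with a crude schema-linking step:
    include only tables mentioned (by name or column) in the question."""
    q = question.lower()
    linked = {t: cols for t, cols in schema.items()
              if t.lower() in q or any(c.lower() in q for c in cols)}
    ddl = "\n".join(f"TABLE {t}({', '.join(cols)})" for t, cols in linked.items())
    return f"{ddl}\n-- Question: {question}\n-- SQL:"

schema = {"orders": ["id", "customer", "total"],
          "employees": ["id", "name", "salary"]}
prompt = build_prompt("What is the total per customer?", schema)
```

Real systems replace the substring match with embedding-based or KG-assisted linking, which is exactly the contextual-accuracy role the survey attributes to knowledge graphs.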

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1080 - Mohsenimofidi 2024
Classifying User Intent for Effective Prompt Engineering: A Case of a Chatbot for Startup Teams

Mohsenimofidi, S.; Raghavendra Prasad, A. S.; Zahid, A.; Rafiq, U.; Wang, X.; Attal, M. I.

Generative AI for Effective Softw. Development 2024;():317-329

Springer Nature 2024

DOI: 10.1007/978-3-031-55642-5_15 · Ref ID: 4232

Prompt engineering plays a pivotal role in effective interaction with large language models (LLMs), including ChatGPT. Understanding user intent behind interactions with LLMs is an important part of prompt construction to elicit relevant and meaningful responses from them. Existing literature sheds little light on this aspect of prompt engineering. Our study seeks to address this knowledge gap. Using the example of building a chatbot for startup teams to obtain better responses from ChatGPT, we demonstrate a feasible way of classifying user intent automatically using ChatGPT itself. Our study contributes to a rapidly increasing body of knowledge of prompt engineering for LLMs. Even though the application domain of our approach is startups, it can be adapted to support effective prompt engineering in various other application domains as well. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Srividya voted
Mike voted
Final decision
What was the agreed final decision?

#776 - Moiseev 2022
SKILL: Structured Knowledge Infusion for Large Language Models

Moiseev, F.; Dong, Z.; Alfonseca, E.; Jaggi, M.; Assoc Computat, Linguist

Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAAACL) - Human Language Technologies 2022;():1581-1588

Seattle, WA Assoc Computational Linguistics-Acl 2022

Ref ID: 3168

Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks. However, it is largely unexplored whether they can better internalize knowledge from structured data, such as a knowledge graph, or from text. In this work, we propose a method to infuse structured knowledge into LLMs by directly training T5 models on factual triples of knowledge graphs (KGs). We show that models pre-trained on the Wikidata KG with our method outperform the T5 baselines on FreebaseQA and WikiHop, as well as the Wikidata-answerable subset of TriviaQA and NaturalQuestions. The models pretrained on factual triples compare competitively with the ones pretrained on natural language sentences that contain the same knowledge. Trained on a smaller KG, WikiMovies, we saw a 3x improvement of the exact match score on the MetaQA task compared to the T5 baseline. The proposed method has the advantage that no alignment between the knowledge graph and text corpus is required in curating training data. This makes our method particularly useful when working with industry-scale knowledge graphs.
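
Training directly on triples amounts to serializing each (subject, relation, object) into a text-to-text example. A simplified sketch of that preprocessing, not the paper's actual pipeline: here the object is masked with T5's `<extra_id_0>` sentinel and the model is asked to recover it; the masking scheme and example triple are illustrative assumptions.

```python
def triples_to_examples(triples):
    """Serialize KG triples into (input, target) pairs for T5-style
    span-corruption training, masking the object of each triple."""
    return [(f"{s} {r} <extra_id_0>", f"<extra_id_0> {o}")
            for s, r, o in triples]

pairs = triples_to_examples([("Douglas Adams", "occupation", "writer")])
# pairs[0] == ("Douglas Adams occupation <extra_id_0>", "<extra_id_0> writer")
```

Because each example comes straight from a triple, no KG-to-text alignment is needed, which is the curation advantage the abstract highlights.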

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1691 - Moses 2024
NLPeople at TextGraphs-17 Shared Task: Chain of Thought Questioning to Elicit Decompositional Reasoning

Moses, M.; Kuruvanthodi, V.; Elkaref, M.; Tanaka, S.; Barry, J.; De Mel, G.; Watson, C. D.

TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():142-148

Association for Computational Linguistics (ACL) 2024

Ref ID: 4247

This paper presents the approach of the NLPeople team for the Text-Graph Representations for KGQA Shared Task at TextGraphs-17 (Sakhovskiy et al., 2024). The task involved selecting an answer for a given question from a list of candidate entities. We show that prompting Large Language Models (LLMs) to break down a natural language question into a series of sub-questions allows models to understand complex questions. The LLMs arrive at the final answer by answering the intermediate questions using their internal knowledge, without needing additional context. Our approach to the task uses an ensemble of prompting strategies to guide how LLMs interpret various types of questions. Our submission achieves an F1 score of 85.90, ranking 1st among the other participants in the task. © 2024 Association for Computational Linguistics.

Kwesi voted
yuexi voted
Final decision
What was the agreed final decision?

#1130 - Mousavi 2024
Construction of Paired Knowledge Graph - Text Datasets Informed by Cyclic Evaluation

Mousavi, A.; Zhan, X.; Bai, H.; Shi, P.; Rekatsinas, T.; Han, B.; Li, Y.; Pound, J.; Susskind, J.; Schluter, N.; Ilyas, I. F.; Jaitly, N.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():3782-3803

European Language Resources Association (ELRA) 2024

Ref ID: 4535

Datasets that pair Knowledge Graphs (KG) and text together (KG-T) can be used to train forward and reverse neural models that generate text from KGs and vice versa. However, models trained on datasets where KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise and find that noisier datasets do indeed lead to more hallucination. We argue that the ability of forward and reverse models trained on a dataset to cyclically regenerate the source KG or text is a proxy for the equivalence between the KG and the text in the dataset. Using cyclic evaluation, we find that the manually created WebNLG is much better than the automatically created TeKGen and T-REx. Informed by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text, and show the impact of each of the heuristics on cyclic evaluation. We also construct two synthetic datasets using large language models (LLMs), and observe that these are conducive to models that perform well on cyclic generation of text, but less so on cyclic generation of KGs, probably because of the lack of a consistent underlying ontology. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3354 - Mousavi 2024
DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs

Mousavi, Seyed Mahed; Alghisi, Simone; Riccardi, Giuseppe

arXiv 2024;():

2024

Ref ID: 8232

LLMs acquire knowledge from massive data snapshots collected at different timestamps. Their knowledge is then commonly evaluated using static benchmarks. However, factual knowledge is generally subject to time-sensitive changes, and static benchmarks cannot address those cases. We present an approach to dynamically evaluate the knowledge in LLMs and its time-sensitivity against Wikidata, a publicly available, up-to-date knowledge graph. We evaluate the time-sensitive knowledge in twenty-four private and open-source LLMs, as well as the effectiveness of four editing methods in updating outdated facts. Our results show that 1) outdatedness is a critical problem across state-of-the-art LLMs; 2) LLMs output inconsistent answers when prompted with slight variations of the question prompt; and 3) the performance of state-of-the-art knowledge editing algorithms is very limited, as they cannot reduce the cases of outdatedness and output inconsistency.
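
Checking an LLM answer against the currently valid object of a time-qualified KG statement can be sketched as follows. This is a toy stand-in for querying Wikidata (whose statements carry temporal qualifiers); the tuple format, fictional entities, and function name are assumptions for illustration.

```python
def check_outdated(llm_answer, kg_facts, subject, relation):
    """Facts are (subject, relation, object, end_year) tuples; end_year is
    None for the statement that is currently valid. Flag the LLM answer as
    outdated unless it matches a currently valid object."""
    current = [o for s, r, o, end in kg_facts
               if s == subject and r == relation and end is None]
    return "up-to-date" if llm_answer in current else "outdated"

# Fictional snapshot: the role changed hands, old statement has an end year.
facts = [("Acme Corp", "chief executive officer", "A. Founder", 2020),
         ("Acme Corp", "chief executive officer", "B. Successor", None)]
verdict = check_outdated("A. Founder", facts, "Acme Corp",
                         "chief executive officer")  # -> "outdated"
```

Running such a check over many dynamic facts, and re-running it after each knowledge edit, is the kind of dynamic evaluation loop the abstract describes.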

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#3833 - Mruthyunjaya 2023
Rethinking Language Models as Symbolic Knowledge Graphs

Mruthyunjaya, Vishwas; Pezeshkpour, Pouya; Hruschka, Estevam; Bhutani, Nikita

arXiv 2023;():

2023

Ref ID: 7817

Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric applications such as search, question answering and recommendation. As contemporary language models (LMs) trained on extensive textual data have gained prominence, researchers have extensively explored whether the parametric knowledge within these models can match up to that present in knowledge graphs. Various methodologies have indicated that enhancing the size of the model or the volume of training data enhances its capacity to retrieve symbolic knowledge, often with minimal or no human supervision. Despite these advancements, there is a void in comprehensively evaluating whether LMs can encompass the intricate topological and semantic attributes of KGs, attributes crucial for reasoning processes. In this work, we provide an exhaustive evaluation of language models of varying sizes and capabilities. We construct nine qualitative benchmarks that encompass a spectrum of attributes including symmetry, asymmetry, hierarchy, bidirectionality, compositionality, paths, entity-centricity, bias and ambiguity. Additionally, we propose novel evaluation metrics tailored for each of these attributes. Our extensive evaluation of various LMs shows that while these models exhibit considerable potential in recalling factual information, their ability to capture intricate topological and semantic traits of KGs remains significantly constrained. We note that our proposed evaluation metrics are more reliable in evaluating these abilities than the existing metrics. Lastly, some of our benchmarks challenge the common notion that larger LMs (e.g., GPT-4) universally outshine their smaller counterparts (e.g., BERT).

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#220 - Mu 2024
Enhancing Narrative Commonsense Reasoning With Multilevel Causal Knowledge

Mu, F. T.; Li, W. J.

IEEE Trans. Neural Netw. Learn. Syst. 2024;():13

2024

DOI: 10.1109/tnnls.2024.3380851 · Ref ID: 3430

A narrative is an account of the unfolding of events, along with explanations of how and why these processes and events came to be. To understand narratives, causality has proven to be especially useful. Causality manifests itself primarily at both the event and sentence levels, offering essential insights into understanding narratives. However, previous works utilize either sentence-level or event-level causalities. In this article, we devise a two-stage approach to fully exploit both levels of causal relationships. In the first stage, by devising post-training tasks, we inject sentence-level causalities into pretrained language models (PLMs). The causal-enhanced PLMs, which carry sentence-level causalities, can be transferred to downstream tasks. In the second stage, we utilize event causalities to further refine narrative commonsense reasoning. However, the event sparsity problem makes it difficult to learn event representations and capture useful causal semantics. To alleviate this problem, we break down events into multiple word components, enabling the retrieval of word-word relations between these components; since word-word relations capture the interplay between event components, this mitigates the sparsity. Based on the event-level causalities and the word-level relations, we construct a hierarchical knowledge graph (KG) as the knowledge ground. A KG-based reasoning process is then employed for narrative commonsense reasoning. Experimental results affirm the effectiveness of our framework.

yuexi voted
mohammed afaan voted

#827 - Mühlenberg 2019
Towards information extraction from ISR reports for decision support using a two-stage learning-based approach

Mühlenberg, D.; Kuwertz, A.; Schenkel, P.; Müller, W.

24th Conference on Open Architecture/Open Business Model Net-Centric Systems and Defense Transformation 2019;11015():

Baltimore, MD Spie-Int Soc Optical Engineering 2019

DOI: 10.1117/12.2518599 · Ref ID: 3736

The main challenge of computer linguistics is to represent the meaning of text in a computer model. Statistics-based methods with manually created features have been used for more than 30 years with a divide-and-conquer approach to mark interesting features in free text. Around 2010, deep learning concepts found their way into the text-understanding research community. Deep learning is very attractive and easy to apply but needs massive pools of annotated and high-quality data from every target domain, which is generally not available, especially for the military domain. When changing the application domain one needs additional or new data to adapt the language models to the new domain. To overcome the everlasting "data problem" we chose a novel two-step approach by first using formal representations of the meaning and then applying a rule-based mapping to the target domain. As an intermediate language representation, we used abstract meaning representation (AMR) and trained a general base model. This base model was then trained with additional data from the intended domains (transfer learning), evaluating the quality of the parser with a stepwise approach in which we measured the parser performance against the amount of training data. This approach answered the question of how much data we need to get the required quality when changing an application domain. The mapping of the meaning representation to the target domain model gave us more control over specifics of the domain, which are not generally representable by a machine learning approach with self-learned feature vectors.

mohammed afaan voted
yuexi voted

#728 - Muludi 2024
Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model

Muludi, K.; Fitria, K. M.; Triloka, J.; Sutedi

Int. J. Adv. Comput. Sci. Appl. 2024;15(3):776-785

2024

Ref ID: 3682

This study introduces the Retrieval Augmented Generation (RAG) method to improve Question-Answering (QA) systems by addressing document processing problems in Natural Language Processing. It represents the latest breakthrough in applying RAG to document question-and-answer applications, overcoming previous QA system obstacles. RAG combines search techniques over a vector store with the text generation capabilities of Large Language Models, offering a time-efficient alternative to the limitations of manual reading. The research evaluates a RAG system that uses GPT-3.5-turbo, the Generative Pre-trained Transformer model behind ChatGPT, and its impact on document data processing, comparing it with other applications. This research also provides datasets to test the capabilities of the document QA system. The proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation, supporting benchmarking in research communities. Results highlight RAG's superiority: achieving a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, outperforming others at 0.5; obtaining an F1 score of 0.88 in BERTScore, surpassing other QA apps at 0.81; attaining a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, surpassing others with a precision of 0.09; and scoring 0.33 in Jaccard Similarity, outshining others at 0.04. These findings underscore RAG's efficiency and competitiveness, promising a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology.
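Of the metrics listed, Jaccard similarity is the simplest to state precisely. A minimal token-set sketch (not the study's exact tokenizer or evaluation code):

```python
def jaccard_similarity(reference: str, candidate: str) -> float:
    """Token-set Jaccard similarity: |intersection| / |union|.
    Whitespace tokenization is an assumption made for this sketch."""
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    if not (ref | cand):
        return 0.0
    return len(ref & cand) / len(ref | cand)

print(jaccard_similarity("the cat sat on the mat",
                         "the cat lay on the mat"))  # 4 shared / 6 total types
```

ROUGE and BLEU differ mainly in that they count n-gram overlaps with recall- and precision-oriented weighting rather than a symmetric set ratio.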

mohammed afaan voted
yuexi voted

#386 - Mzwri 2023
Internet Wizard for Enhancing Open-Domain Question-Answering Chatbot Knowledge Base in Education

Mzwri, K.; Turcsányi-Szabo, M.

Appl. Sci.-Basel 2023;13(14):20

2023

DOI: 10.3390/app13148114 · Ref ID: 3420

Chatbots have gained widespread popularity for their task automation capabilities and consistent availability in various domains, including education. However, their ability to adapt to the continuously evolving and dynamic nature of knowledge is limited. This research investigates the implementation of an internet wizard to enhance the knowledge base of an open-domain question-answering chatbot. The proposed approach leverages search engines, particularly Google, and its features, including featured snippets, knowledge graph, and organic search, in conjunction with data science and natural language models. This mechanism empowers the chatbot to dynamically access the extensive and up-to-date knowledge available on the web, enabling the provision of real-time, pertinent answers to user queries sourced from web documents. A pilot study in a higher education context evaluated the chatbot's mechanism and features, confirming its proficiency in generating responses across a broad range of educational and non-educational topics. Positive feedback and high user satisfaction validate these findings. Notably, the chatbot's dynamic feature of retrieving related or follow-up questions from search engines significantly enhances student engagement and facilitates exploration of supplementary information beyond the curriculum.

mohammed afaan voted
yuexi voted

#1741 - Na 2023
A Pre-training Method Inspired by Large Language Model for Power Named Entity Recognition

Na, Q.; Li, X.; Wang, Y.; Li, J.; Yang, Y.; Zhang, H.

ACM International Conference Proceeding Series 2023;():308-312

Association for Computing Machinery 2023

DOI: 10.1145/3653081.3653131 · Ref ID: 4712

In recent years, the field of natural language processing has witnessed remarkable advancements due to the success of large language models. These models leverage the Transformer architecture and pre-training techniques to achieve impressive results. In this paper, we draw inspiration from large language models and apply these techniques to the task of named entity recognition in the domain of power grids, which is critical for building power grid knowledge graphs and question-answering systems. Specifically, we propose a BERT-CNN-BIGRU-CRF deep learning model for named entity recognition. This model effectively harnesses the semantic modeling capabilities and pre-training knowledge of BERT, which is based on the Transformer architecture. By incorporating CNN and BIGRU, the model captures and models both local and global features, respectively. The CRF layer is employed for label classification. This combination of components ensures a high level of recognition accuracy. To evaluate the performance of the proposed model, we train our model on annotated maintenance plan data. We compare its results with those of other commonly used models. The evaluation metrics include recall, precision, and F1 score, which are widely employed in named entity recognition tasks. Our proposed model achieves optimal performance across all three metrics, demonstrating its superiority over other models. © 2023 ACM.
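The CRF layer's label classification rests on Viterbi decoding over emission and transition scores. A compact, framework-free sketch of that decoding step, with hypothetical BIO scores rather than anything from the paper's trained model:

```python
def viterbi(emissions, transitions, labels):
    """Find the label sequence maximizing summed emission + transition
    log-scores, as a linear-chain CRF output layer does at inference time."""
    best = {l: emissions[0][l] for l in labels}
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptr = {}, {}
        for l in labels:
            prev, s = max(((q, best[q] + transitions.get((q, l), 0.0))
                           for q in labels), key=lambda x: x[1])
            new_best[l] = s + scores[l]
            ptr[l] = prev
        best = new_best
        backptrs.append(ptr)
    last = max(labels, key=lambda l: best[l])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical per-token BIO log-scores for a three-token sentence.
emissions = [{"B": 2.0, "I": 0.0, "O": 0.5},
             {"B": 0.0, "I": 1.5, "O": 1.0},
             {"B": 0.1, "I": 0.2, "O": 2.0}]
transitions = {("O", "I"): -5.0}  # penalize entering I from O
print(viterbi(emissions, transitions, ["B", "I", "O"]))  # ['B', 'I', 'O']
```

The transition table is what lets the CRF forbid invalid label sequences (such as an I tag directly after O), which per-token classifiers cannot enforce.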

Mike voted
Srividya voted

#3860 - Nadkarni 2021
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study

Nadkarni, Rahul; Wadden, David; Beltagy, Iz; Smith, Noah A.; Hajishirzi, Hannaneh; Hope, Tom

arXiv 2021;():

2021

Ref ID: 7466

Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug design and repurposing. Recent work has shown that general-domain language models (LMs) can serve as "soft" KGs, and that they can be fine-tuned for the task of KG completion. In this work, we study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We evaluate several domain-specific LMs, fine-tuning them on datasets centered on drugs and diseases that we represent as KGs and enrich with textual entity descriptions. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance. Finally, we demonstrate the advantage of LM models in the inductive setting with novel scientific entities. Our datasets and code are made publicly available.

Mike voted
Srividya voted

#1248 - Nahed 2024
Enhancing Clinical Trial Summarization: Leveraging Large Language Models and Knowledge Graphs for Entity Preservation

Nahed, P.; Kambar, M. E. Z. N.; Taghva, K.

Lecture Notes in Networks and Systems 2024;1003 LNNS():325-336

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-981-97-3302-6_26 · Ref ID: 4434

ClinicalTrials.gov is an accessible online medical resource for researchers, healthcare professionals, and policy designers seeking detailed information on clinical trials. Summarizing these long clinical records can significantly reduce the time needed for the database users as the process transforms comprehensive information into concise synopses, preserving the essential meaning and facilitating understanding. In this paper, we employ the Bidirectional and Auto-Regressive Transformers model to generate the trials’ brief summaries. Our contributions provide new preprocessing techniques for model training, which leads to a robust summarization model. The fine-tuned model significantly enhanced ROUGE-1, ROUGE-2, and ROUGE-L F1-scores by 14%, 23%, and 20%, respectively, compared to previous studies. Additionally, we present an innovative knowledge graph based on entity classes to assess the generated summaries. This graph not only quantifies the essential entities transformed from the original text to the summaries but also provides insights into their specific order and arrangement in sentences. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

mohammed afaan voted
yuexi voted

#2350 - Najjar 2005
DOKGETT - an authoring tool for cognitive model-based generation of the knowledge

Najjar, M.; Fournier-Viger, P.; Mayers, A.; Halle, J.

Fifth IEEE International Conference on Advanced Learning Technologies (ICALT'05) 2005;():371-375

2005

DOI: 10.1109/ICALT.2005.127 · Ref ID: 6209

In this paper we present an authoring tool that permits graphically modelling the domain knowledge of any subject matter and automatically transposing it into related XML files. The generated content serves as reasoning support for a tutor when interacting with students engaged in learning activities through virtual learning environments.

mohammed afaan voted
yuexi voted

#365 - Naseem 2022
Incorporating Medical Knowledge to Transformer-based Language Models for Medical Dialogue Generation

Naseem, U.; Bandi, A.; Raza, S.; Rashid, J.; Chakravarthi, B. R.; Assoc Computat, Linguist

21st Workshop on Biomedical Language Processing (BioNLP) at the 60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():110-115

Dublin, IRELAND Assoc Computational Linguistics-Acl 2022

Ref ID: 3184

Medical dialogue systems have the potential to assist doctors in expanding access to medical care, improving the quality of patient experiences, and lowering medical expenses. The computational methods are still in their early stages and are not ready for widespread application despite their great potential. Existing transformer-based language models have shown promising results but lack domain-specific knowledge. However, to diagnose like doctors, an automatic medical diagnosis necessitates more stringent requirements for the rationality of the dialogue in the context of relevant knowledge. In this study, we propose a new method that addresses the challenges of medical dialogue generation by incorporating medical knowledge into transformer-based language models. We present a method that leverages an external medical knowledge graph and injects triples as domain knowledge into the utterances. Automatic and human evaluation on a publicly available dataset demonstrates that incorporating medical knowledge outperforms several state-of-the-art baseline methods.

Ishan voted
Srividya voted

#1107 - Naveen 2024
Comparative Methods of Implementation for Different Question Answering Systems

Naveen, C.; Amutha, B.

Proceedings - 2024 5th International Conference on Intelligent Communication Technologies and Virtual Mobile Networks, ICICV 2024 2024;():567-575

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICICV62344.2024.00096 · Ref ID: 4636

This research introduces an innovative Question Answering (QA) system tailored explicitly for government department inquiries regarding individuals. Harnessing the prowess of cutting-edge language models such as BERT and T5 (Text-to-Text Transfer Transformer), the system excels in understanding complex queries within diverse governmental domains. Moreover, it incorporates a specialized Knowledge Graph meticulously curated with interconnected information about people across various departments. By integrating BERT and T5 for versatile query comprehension and answer generation alongside a comprehensive People-centric Knowledge Graph, this system aims to revolutionize information retrieval within government entities. The seamless fusion of these technologies promises accurate, contextually rich responses, optimizing operational efficiency across government departments and fostering streamlined access to crucial information. © 2024 IEEE.

mohammed afaan voted
Ishan voted

#378 - Nayyeri 2023
Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces

Nayyeri, M.; Wang, Z. H.; Akter, M. M.; Alam, M. M.; Rony, M. R. A.; Lehmann, J.; Staab, S.

22nd International Semantic Web Conference (ISWC) 2023;14265():388-407

Athens, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-47240-4_21 · Ref ID: 2930

Knowledge graphs comprise structural and textual information to represent knowledge. To predict new structural knowledge, current approaches learn representations using both types of information through knowledge graph embeddings and language models. These approaches commit to a single pre-trained language model. We hypothesize that heterogeneous language models may provide complementary information not exploited by current approaches. To investigate this hypothesis, we propose a unified framework that integrates multiple representations of structural knowledge and textual information. Our approach leverages hypercomplex algebra to model the interactions between (i) graph structural information and (ii) multiple text representations. Specifically, we utilize Dihedron models with 4*D dimensional hypercomplex numbers to integrate four different representations: structural knowledge graph embeddings, word-level representations (e.g., Word2vec and FastText), sentence-level representations (using a sentence transformer), and document-level representations (using FastText or Doc2vec). Our unified framework scores the plausibility of labeled edges via Dihedron products, thus modeling pairwise interactions between the four representations. Extensive experimental evaluations on standard benchmark datasets confirm our hypothesis, showing the superiority of our two new frameworks for link prediction tasks.
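Dihedrons are a four-dimensional hypercomplex algebra closely related to quaternions. As a rough sketch of how a 4-D hypercomplex product mixes four component representations pairwise, here is the quaternion Hamilton product; this is an illustration of the algebraic idea, not the authors' Dihedron scoring function:

```python
def hamilton_product(p, q):
    """Hamilton product of quaternions (a, b, c, d) = a + bi + cj + dk.
    Every output component mixes pairs of input components, which is the
    property the paper exploits to couple four embedding types."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

# i * j = k and i * i = -1, as in the standard quaternion algebra.
print(hamilton_product((0, 1, 0, 0), (0, 0, 1, 0)))  # (0, 0, 0, 1)
print(hamilton_product((0, 1, 0, 0), (0, 1, 0, 0)))  # (-1, 0, 0, 0)
```

In the paper's setting each of the four slots would hold an embedding vector (structural, word-, sentence-, and document-level) rather than a scalar.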

Kwesi voted
Davis voted

#3138 - Nayyeri 2023
Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces

Nayyeri, Mojtaba; Wang, Zihao; Akter, Mst. Mahfuja; Alam, Mirza Mohtashim; Rony, Md Rashad Al Hasan; Lehmann, Jens; Staab, Steffen

The Semantic Web – ISWC 2023: 22nd International Semantic Web Conference, Athens, Greece, November 6–10, 2023, Proceedings, Part I 2023;():388–407

Athens, Greece Springer-Verlag 2023

DOI: 10.1007/978-3-031-47240-4_21 · Ref ID: 7152

mohammed afaan voted
Ishan voted

#7 - Nguyen 2021
Advanced Semantics for Commonsense Knowledge Extraction

Nguyen, T. P.; Razniewski, S.; Weikum, G.; Acm

30th World Wide Web Conference (WWW) 2021;():2636-2647

Electr Network Assoc Computing Machinery 2021

DOI: 10.1145/3442381.3449827 · Ref ID: 3760

Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. A web interface, data and code can be found at https://www.mpi-inf.mpg.de/ascent.

Mike voted
Srividya voted

#2032 - Ni 2024
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation

Ni, S.; Bi, K.; Guo, J.; Cheng, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():11375-11388

Association for Computational Linguistics (ACL) 2024

Ref ID: 4253

Large Language Models (LLMs) have been found to have difficulty knowing they do not possess certain knowledge and tend to provide specious answers in such cases. Retrieval Augmentation (RA) has been extensively studied to mitigate LLMs' hallucinations. However, due to the extra overhead and unassured quality of retrieval, it may not be optimal to conduct RA all the time. A straightforward idea is to only conduct retrieval when LLMs are uncertain about a question. This motivates us to enhance the LLMs' ability to perceive their knowledge boundaries to help RA. In this paper, we first quantitatively measure this ability of LLMs and confirm their overconfidence. Then, we study how LLMs' certainty about a question correlates with their dependence on external retrieved information. We propose several methods to enhance LLMs' perception of knowledge boundaries and show that they are effective in reducing overconfidence. Additionally, equipped with these methods, LLMs can achieve comparable or even better performance of RA with much fewer retrieval calls. The code can be found at https://github.com/ShiyuNee/When-to-Retrieve. © 2024 Association for Computational Linguistics.
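The core control flow, retrieving only when the model's certainty falls below a threshold, can be sketched as follows. The generator, confidence estimator, and retriever here are hypothetical stand-ins, not the paper's methods:

```python
def answer_with_selective_ra(question, llm_generate, estimate_confidence,
                             retrieve, threshold=0.75):
    """Selective retrieval augmentation: skip the retrieval call when the
    model is sufficiently certain, saving retrieval overhead."""
    draft = llm_generate(question, context=None)
    if estimate_confidence(question, draft) >= threshold:
        return draft                    # trust parametric knowledge
    context = retrieve(question)        # fall back to external evidence
    return llm_generate(question, context=context)

# Toy stand-ins to exercise both branches.
gen = lambda q, context=None: f"answer({q})" if context is None else f"grounded({q})"
print(answer_with_selective_ra("easy", gen, lambda q, d: 0.9, lambda q: "docs"))
print(answer_with_selective_ra("hard", gen, lambda q, d: 0.2, lambda q: "docs"))
```

The research question the paper tackles is precisely how to make `estimate_confidence` trustworthy, since raw LLM self-reports are overconfident.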

yuexi voted
Srividya voted

#1087 - Nighojkar 2022
Cognitive Modeling of Semantic Fluency Using Transformers

Nighojkar, A.; Khlyzova, A.; Licato, J.

CEUR Workshop Proceedings 2022;3251():

CEUR-WS 2022

Ref ID: 5472

Can deep language models be explanatory models of human cognition? If so, what are their limits? In order to explore this question, we propose an approach called hyperparameter hypothesization that uses predictive hyperparameter tuning in order to find individuating descriptors of cognitive-behavioral profiles. We take the first step in this approach by predicting human performance in the semantic fluency task (SFT), a well-studied task in cognitive science that has never before been modeled using transformer-based language models (TLMs). In our task setup, we compare several approaches to predicting which word an individual performing SFT will utter next. We report preliminary evidence suggesting that, despite obvious implementational differences in how people and TLMs learn and use language, TLMs can be used to identify individual differences in human fluency task behaviors better than existing computational models, and may offer insights into human memory retrieval strategies, a cognitive process not typically considered to be the kind of thing TLMs can model. Finally, we discuss the implications of this work for cognitive modeling of knowledge representations. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Srividya voted
Ishan voted

#3962 - Ning 2024
UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction

Ning, Yansong; Liu, Hao

arXiv 2024;():

2024

Ref ID: 8089

Urban knowledge graph has recently worked as an emerging building block to distill critical knowledge from multi-sourced urban data for diverse urban application scenarios. Despite its promising benefits, urban knowledge graph construction (UrbanKGC) still heavily relies on manual effort, hindering its potential advancement. This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. Specifically, we first construct the knowledgeable instruction set for UrbanKGC tasks (such as relational triplet extraction and knowledge graph completion) via heterogeneity-aware and geospatial-infused instruction generation. Moreover, we propose a tool-augmented iterative trajectory refinement module to enhance and refine the trajectories distilled from GPT-4. Through hybrid instruction fine-tuning with augmented trajectories on Llama 2 and Llama 3 family, we obtain UrbanKGC agent family, consisting of UrbanKGent-7/8/13B version. We perform a comprehensive evaluation on two real-world datasets using both human and GPT-4 self-evaluation. The experimental results demonstrate that UrbanKGent family can not only significantly outperform 31 baselines in UrbanKGC tasks, but also surpass the state-of-the-art LLM, GPT-4, by more than 10% with approximately 20 times lower cost. Compared with the existing benchmark, the UrbanKGent family could help construct an UrbanKG with hundreds of times richer relationships using only one-fifth of the data. Our data and code are available at https://github.com/usail-hkust/UrbanKGent.

mohammed afaan voted
Ishan voted

#2028 - Niu 2024
WHAT DOES THE KNOWLEDGE NEURON THESIS HAVE TO DO WITH KNOWLEDGE?

Niu, J.; Liu, A.; Zhu, Z.; Penn, G.

12th International Conference on Learning Representations, ICLR 2024 2024;():

International Conference on Learning Representations, ICLR 2024

Ref ID: 4630

We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism underlying the ability of large language models to recall facts from a training corpus. This nascent thesis proposes that facts are recalled from the training corpus through the MLP weights in a manner resembling key-value memory, implying in effect that “knowledge” is stored in the network. Furthermore, by modifying the MLP modules, one can control the language model's generation of factual information. The plausibility of the KN thesis has been demonstrated by the success of KN-inspired model editing methods (Dai et al., 2022; Meng et al., 2022). We find that this thesis is, at best, an oversimplification. Not only have we found that we can edit the expression of certain linguistic phenomena using the same model editing methods but, through a more comprehensive evaluation, we have found that the KN thesis does not adequately explain the process of factual expression. While it is possible to argue that the MLP weights store complex patterns that are interpretable both syntactically and semantically, these patterns do not constitute “knowledge.” To gain a more comprehensive understanding of the knowledge representation process, we must look beyond the MLP weights and explore recent models' complex layer structures and attention mechanisms. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.

Mike voted
Srividya voted

#3714 - Niu 2024
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval

Niu, Mengjia; Li, Hao; Shi, Jie; Haddadi, Hamed; Mo, Fan

arXiv 2024;():

2024

Ref ID: 8288

Large language models (LLMs) have demonstrated remarkable capabilities across various domains, although their susceptibility to hallucination poses significant challenges for their deployment in critical areas such as healthcare. To address this issue, retrieving relevant facts from knowledge graphs (KGs) is considered a promising method. Existing KG-augmented approaches tend to be resource-intensive, requiring multiple rounds of retrieval and verification for each factoid, which impedes their application in real-world scenarios. In this study, we propose Self-Refinement-Enhanced Knowledge Graph Retrieval (Re-KGR) to augment the factuality of LLMs' responses with less retrieval efforts in the medical field. Our approach leverages the attribution of next-token predictive probability distributions across different tokens, and various model layers to primarily identify tokens with a high potential for hallucination, reducing verification rounds by refining knowledge triples associated with these tokens. Moreover, we rectify inaccurate content using retrieved knowledge in the post-processing stage, which improves the truthfulness of generated responses. Experimental results on a medical dataset demonstrate that our approach can enhance the factual capability of LLMs across various foundational models as evidenced by the highest scores on truthfulness.

yuexi voted
Srividya voted

#3510 - Nori 2023
Identification of Knowledge Neurons in Protein Language Models

Nori, Divya; Singireddy, Shivali; Have, Marina Ten

arXiv 2023;():

2023

Ref ID: 7994

Neural language models have become powerful tools for learning complex representations of entities in natural language processing tasks. However, their interpretability remains a significant challenge, particularly in domains like computational biology where trust in model predictions is crucial. In this work, we aim to enhance the interpretability of protein language models, specifically the state-of-the-art ESM model, by identifying and characterizing knowledge neurons - components that express understanding of key information. After fine-tuning the ESM model for the task of enzyme sequence classification, we compare two knowledge neuron selection methods that preserve a subset of neurons from the original model. The two methods, activation-based and integrated gradient-based selection, consistently outperform a random baseline. In particular, these methods show that there is a high density of knowledge neurons in the key vector prediction networks of self-attention modules. Given that key vectors specialize in understanding different features of input sequences, these knowledge neurons could capture knowledge of different enzyme sequence motifs. In the future, the types of knowledge captured by each neuron could be characterized.

Davis voted
yuexi voted

#438 - Oduro-Afriyie 2023
Knowledge Graph Enabled Open-Domain Conversational Question Answering

Oduro-Afriyie, J.; Jamil, H.

15th International Conference on Flexible Query Answering Systems (FQAS) 2023;14113():63-76

European Soc Fuzzy Log & Technol, Mallorca, SPAIN Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-42935-4_6 · Ref ID: 3056

With the advent of natural language enabled applications, there has been a growing appetite for conversational question answering systems. This demand is being largely satisfied with the help of such powerful language models as OpenAI's GPT models, Google's BERT, and BigScience's BLOOM. However, the astounding amount of training data and computing resources required to create such models is a huge challenge. Furthermore, for such systems, catering to multiple application domains typically requires the acquisition of even more training data. We discuss an alternative approach to the problem of open-domain conversational question answering by utilizing knowledge graphs to capture relevant information from a body of text in any domain. We achieve this by allowing the relations of the knowledge graphs to be drawn directly from the body of text being processed, rather than from a fixed ontology. By connecting this process with SPARQL queries generated from natural language questions, we demonstrate the foundations of an open-domain question answering system that requires no training and can switch domains flexibly and seamlessly.
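A minimal sketch of the idea: relation labels come from the source text itself rather than a fixed ontology, and answering reduces to matching a (subject, relation) pair against the graph. All triples below are invented for illustration:

```python
# Tiny text-derived KG: relation strings are lifted straight from the
# source sentences rather than taken from a predefined ontology.
triples = [
    ("marie curie", "discovered", "polonium"),
    ("marie curie", "was awarded", "the nobel prize"),
]

def lookup(subject, relation, kg):
    """Return every object linked to `subject` by `relation`."""
    return [o for (s, r, o) in kg if s == subject and r == relation]

print(lookup("marie curie", "discovered", triples))  # ['polonium']
```

In the actual system this lookup would be expressed as a SPARQL query generated from the natural language question, but the matching principle is the same.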

mohammed afaan voted
Ishan voted

#208 - Oduro-Afriyie 2023
Enabling the Informed Patient Paradigm with Secure and Personalized Medical Question Answering

Oduro-Afriyie, J.; Jamil, H. M.; Acm

14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB) 2023;():

Houston, TX Assoc Computing Machinery 2023

DOI: 10.1145/3584371.3613016 · Ref ID: 3018

Quality patient care is a complex and multifaceted problem requiring the integration of data from multiple sources. We propose Medicient, a knowledge-graph-based question answering system that processes heterogeneous data sources, including patient health records, drug databases, and medical literature, into a unified knowledge graph with zero training. The knowledge graph is then utilized to provide personalized recommendations for treatment or medication. The system leverages the power of large language models for question understanding and natural language response generation, while hiding sensitive patient information. We compare our system to a large language model (ChatGPT), which does not have access to patient health records, and show that our system provides better recommendations. This study contributes to a growing body of research on knowledge graphs and their applications in healthcare.

Mike voted
Xinchen voted

#1710 - Oelen 2024
ORKG ASK: a Neuro-symbolic Scholarly Search and Exploration System

Oelen, A.; Jaradeh, M. Y.; Auer, S.

CEUR Workshop Proceedings 2024;3759():

CEUR-WS 2024

Ref ID: 4226

Purpose: Finding scholarly articles is a time-consuming and cumbersome activity, yet crucial for conducting science. Due to the growing number of scholarly articles, new scholarly search systems are needed to effectively assist researchers in finding relevant literature. Methodology: We take a neuro-symbolic approach to scholarly search and exploration by leveraging state-of-the-art components, including semantic search, Large Language Models (LLMs), and Knowledge Graphs (KGs). The semantic search component composes a set of relevant articles. From this set of articles, information is extracted and presented to the user. Findings: The presented system, called ORKG ASK (Assistant for Scientific Knowledge), provides a production-ready search and exploration system. Our preliminary evaluation indicates that our proposed approach is indeed suitable for the task of scholarly information retrieval. Value: With ORKG ASK, we present a next-generation scholarly search and exploration system and make it available online. Additionally, the system components are open source with a permissive license. © 2024 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1631 - Omar 2023
Measurement of ChatGPT Performance in Mapping Natural Language Specification into an Entity Relationship Diagram

Omar, M. A.

2023 IEEE 11th International Conference on Systems and Control, ICSC 2023 2023;():530-535

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICSC58660.2023.10449869 · Ref ID: 4961

This paper explores the entity relationship diagram, a popular conceptual model used to depict entities, attributes, and relationships graphically. To help with this, we use ChatGPT, a sophisticated language model based on the GPT architecture, which can translate natural language text into an entity relationship diagram. The paper details the process of evaluating how well ChatGPT can perform compared to other state-of-the-art approaches for entity and relationship extraction. Our experimental findings demonstrate the strong ability of ChatGPT to translate natural language text into entity relationship diagrams, which has potential applications for knowledge graph building, data integration, and database schema design. Moreover, it can aid in automating the extraction and organization of information from unstructured text data, thereby simplifying the study of complex systems. © 2023 IEEE.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1511 - Omar 2021
A knowledge graph question-answering platform trained independently of the graph

Omar, R.; Dhall, I.; Sheikh, N.; Mansour, E.

CEUR Workshop Proceedings 2021;2980():

CEUR-WS 2021

Ref ID: 5681

We will demonstrate KGQAn, a question-answering platform trained independently of KGs. KGQAn transforms a question into semantically equivalent SPARQL queries via a novel three-phase strategy based on natural language models trained generally for understanding and leveraging short English text. Without preprocessing or annotated questions on KGs, KGQAn outperformed the existing systems in KG question answering by an improvement of at least 33% in F1-measure and 61% in precision. During the demo, the audience will experience KGQAn for question answering on real KGs of topics of interest to them, such as DBpedia and the OpenCitations Graph, and review the generated SPARQL queries and answers. A demo video is available online. © 2021 CEUR-WS. All rights reserved.
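
The abstract above describes mapping a question to a semantically equivalent SPARQL query. As a rough illustration only (not KGQAn's actual three-phase strategy), a toy pattern-to-template mapping can sketch the idea; the question patterns and the `dbr:`/`dbo:` property names are hypothetical placeholders:

```python
import re
from typing import Optional

# Hypothetical question patterns mapped to (assumed) KG properties.
PATTERNS = [
    # "Who wrote X?" -> query a (hypothetical) dbo:author property
    (re.compile(r"^who wrote (.+)\?$", re.I), "dbo:author"),
    # "Where was X born?" -> query a (hypothetical) dbo:birthPlace property
    (re.compile(r"^where was (.+) born\?$", re.I), "dbo:birthPlace"),
]

def question_to_sparql(question: str) -> Optional[str]:
    """Return a SPARQL query for the question, or None if no pattern matches."""
    for pattern, prop in PATTERNS:
        match = pattern.match(question.strip())
        if match:
            entity = match.group(1).strip().replace(" ", "_")
            return f"SELECT ?x WHERE {{ dbr:{entity} {prop} ?x }}"
    return None

print(question_to_sparql("Who wrote Dune?"))
```

A real system would replace the hand-written patterns with learned question understanding, which is precisely what KGQAn trains independently of any specific KG.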

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#541 - Omeliyanenko 2020
LM4KG: Improving Common Sense Knowledge Graphs with Language Models

Omeliyanenko, J.; Zehe, A.; Hettinger, L.; Hotho, A.

19th International Semantic Web Conference (ISWC) 2020;12506():456-473

Athens, GREECE Springer International Publishing Ag 2020

DOI: 10.1007/978-3-030-62419-4_26 · Ref ID: 3117

Language Models (LMs) and Knowledge Graphs (KGs) are both active research areas in Machine Learning and the Semantic Web. While LMs have brought great improvements for many downstream tasks on their own, they are often combined with KGs, which additionally provide aggregated, well-structured knowledge. Usually, this is done by leveraging KGs to improve LMs. But what happens if we turn this around and use LMs to improve KGs? In this paper, we propose a method enabling the use of the knowledge inherently encoded in LMs to automatically improve the explicit knowledge represented in common sense KGs. Edges in these KGs represent relations between concepts, but the strength of the relations is often not clear. We propose to transform KG relations into natural language sentences, allowing us to utilize the information contained in large LMs to rate these sentences through a new perplexity-based measure, Refined Edge WEIGHTing (REWEIGHT). We test our scoring scheme REWEIGHT on the popular LM BERT to produce new weights for the edges in the well-known ConceptNet KG. By retrofitting existing word embeddings to our modified ConceptNet, we create ConceptNet NumBERTbatch embeddings and show that these outperform the original ConceptNet Numberbatch on multiple established semantic similarity datasets.
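
To make the perplexity-based reweighting idea concrete, here is a minimal sketch that substitutes a toy unigram model for BERT: each edge is verbalized as a sentence, scored by perplexity, and a lower perplexity (a more "natural" statement) yields a higher edge weight. The corpus, the verbalization template, and the 1/perplexity weighting are illustrative assumptions, not the paper's REWEIGHT formulation:

```python
import math
from collections import Counter

# Toy corpus standing in for the language model's training distribution.
CORPUS = "a dog is an animal a cat is an animal a dog can bark a car is a machine".split()
COUNTS = Counter(CORPUS)
TOTAL = sum(COUNTS.values())

def unigram_perplexity(sentence):
    """Perplexity under a unigram model with add-one smoothing."""
    words = sentence.lower().split()
    log_prob = sum(math.log((COUNTS[w] + 1) / (TOTAL + len(COUNTS))) for w in words)
    return math.exp(-log_prob / len(words))

def edge_weight(head, relation, tail):
    """Verbalize (head, relation, tail) and map perplexity to an edge weight."""
    sentence = f"{head} {relation} {tail}"   # naive verbalization template
    return 1.0 / unigram_perplexity(sentence)

# An edge whose verbalization reads like a corpus-typical statement gets a
# larger weight than an implausible one.
assert edge_weight("dog", "can", "bark") > edge_weight("car", "can", "bark")
```

The paper's actual measure swaps this toy model for BERT's scores and uses relation-specific sentence templates, but the reweighting flow is the same.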

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#87 - Omeliyanenko 2023
CapsKG: Enabling Continual Knowledge Integration in Language Models for Automatic Knowledge Graph Completion

Omeliyanenko, J.; Zehe, A.; Hotho, A.; Schlör, D.

22nd International Semantic Web Conference (ISWC) 2023;14265():618-636

Athens, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-47240-4_33 · Ref ID: 2932

Automated completion of knowledge graphs is a popular topic in the Semantic Web community that aims to automatically and continuously integrate new appearing knowledge into knowledge graphs using artificial intelligence. Recently, approaches that leverage implicit knowledge from language models for this task have shown promising results. However, by fine-tuning language models directly to the domain of knowledge graphs, models forget their original language representation and associated knowledge. An existing solution to address this issue is a trainable adapter, which is integrated into a frozen language model to extract the relevant knowledge without altering the model itself. However, this constrains the generalizability to the specific extraction task and by design requires new and independent adapters to be trained for new knowledge extraction tasks. This effectively prevents the model from benefiting from existing knowledge incorporated in previously trained adapters. In this paper, we propose to combine the benefits of adapters for knowledge graph completion with the idea of integrating capsules, introduced in the field of continual learning. This allows the continuous integration of knowledge into a joint model by sharing and reusing previously trained capsules. We find that our approach outperforms solutions using traditional adapters, while requiring notably fewer parameters for continuous knowledge integration. Moreover, we show that this architecture benefits significantly from knowledge sharing in low-resource situations, outperforming adapter-based models on the task of link prediction.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3209 - Oruganti 2023
Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs

Oruganti, Sanjay; Nirenburg, Sergei; English, Jesse; McShane, Marjorie

arXiv 2023;():

2023

Ref ID: 8014

The paper describes a system that uses large language model (LLM) technology to support the automatic learning of new entries in an intelligent agent's semantic lexicon. The process is bootstrapped by an existing non-toy lexicon and a natural language generator that converts formal, ontologically-grounded representations of meaning into natural language sentences. The learning method involves a sequence of LLM requests and includes an automatic quality control step. To date, this learning method has been applied to learning multiword expressions whose meanings are equivalent to those of transitive verbs in the agent's lexicon. The experiment demonstrates the benefits of a hybrid learning architecture that integrates knowledge-based methods and resources with both traditional data analytics and LLMs.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#3737 - Otal 2024
A New Perspective on ADHD Research: Knowledge Graph Construction with LLMs and Network Based Insights

Otal, Hakan T.; Faraone, Stephen V.; Canbaz, M. Abdullah

arXiv 2024;():

2024

Ref ID: 8608

Attention-Deficit/Hyperactivity Disorder (ADHD) is a challenging disorder to study due to its complex symptomatology and diverse contributing factors. To explore how we can gain deeper insights on this topic, we performed a network analysis on a comprehensive knowledge graph (KG) of ADHD, constructed by integrating scientific literature and clinical data with the help of cutting-edge large language models. The analysis, including k-core techniques, identified critical nodes and relationships that are central to understanding the disorder. Building on these findings, we curated a knowledge graph that is usable in a context-aware chatbot (Graph-RAG) with Large Language Models (LLMs), enabling accurate and informed interactions. Our knowledge graph not only advances the understanding of ADHD but also provides a powerful tool for research and clinical applications.

Kwesi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3454 - Pal 2024
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations

Pal, Ankit; Sankarasubbu, Malaikannan

arXiv 2024;():

2024

Ref ID: 8091

Large language models have the potential to be valuable in the healthcare industry, but it's crucial to verify their safety and effectiveness through rigorous evaluation. For this purpose, we comprehensively evaluated both open-source LLMs and Google's new multimodal LLM called Gemini across Medical reasoning, hallucination detection, and Medical Visual Question Answering tasks. While Gemini showed competence, it lagged behind state-of-the-art models like MedPaLM 2 and GPT-4 in diagnostic accuracy. Additionally, Gemini achieved an accuracy of 61.45% on the medical VQA dataset, significantly lower than GPT-4V's score of 88%. Our analysis revealed that Gemini is highly susceptible to hallucinations, overconfidence, and knowledge gaps, which indicate risks if deployed uncritically. We also performed a detailed analysis by medical subject and test type, providing actionable feedback for developers and clinicians. To mitigate risks, we applied prompting strategies that improved performance. Additionally, we facilitated future research and development by releasing a Python module for medical LLM evaluation and establishing a dedicated leaderboard on Hugging Face for medical domain LLMs. Python module can be found at https://github.com/promptslab/RosettaEval

yuexi voted
Davis voted
Final decision
What was the agreed final decision?

#1653 - Palma 2024
Modelling Interestingness: A Workflow for Surprisal-based Knowledge Mining in Narrative Semantic Networks

Palma, C.

CEUR Workshop Proceedings 2024;3749():

CEUR-WS 2024

Ref ID: 4325

This working paper outlines ongoing and planned efforts aimed at achieving an objective modelling of interestingness in cross-domain knowledge bases. In pursuit of this objective, clickstream data serves as a primary component for developing a novel measure of entity-related popularity. This measure is then integrated with two couple-related similarity measures, culminating in the formulation of a new interestingness law. This principled formalization is designed to undergo human validation, ultimately enhancing its reliability and comprehensiveness. The present contribution is intended to be propaedeutic to the development of a pipeline having a Knowledge Graph as input, and an expanded version of the same as output, whereby every link is labelled by an interestingness score, thus highlighting the most interesting paths, determined according to the proposed domain-specific heuristics for interestingness detection. This work is expected to yield significant benefits for Automatic Story Generation. Although this discipline, aided by Machine Learning, has made remarkable progress in surface-level text realization, it still grapples with producing qualitatively rich outputs that offer substantive informativeness. To address this challenge, a Knowledge Graph (particularly its most compelling paths identified through the proposed methodology) is anticipated to be integrated with the Large Language Model, thus grounding the final output in the contextual information selected by users throughout the entire workflow, a scenario which is particularly valuable in educational settings, where generated stories frequently serve pedagogical purposes. © 2024 Copyright for this paper by its authors.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3939 - Pan 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Pan, Kaihang; Fan, Zhaoyu; Li, Juncheng; Yu, Qifan; Fei, Hao; Tang, Siliang; Hong, Richang; Zhang, Hanwang; Sun, Qianru

arXiv 2024;():

2024

Ref ID: 8639

The swift advancement in Multimodal LLMs (MLLMs) also presents significant challenges for effective knowledge editing. Current methods, including intrinsic knowledge editing and external knowledge resorting, each possess strengths and weaknesses, struggling to balance the desired properties of reliability, generality, and locality when applied to MLLMs. In this paper, we propose UniKE, a novel multimodal editing method that establishes a unified perspective and paradigm for intrinsic knowledge editing and external knowledge resorting. Both types of knowledge are conceptualized as vectorized key-value memories, with the corresponding editing processes resembling the assimilation and accommodation phases of human cognition, conducted at the same semantic levels. Within such a unified framework, we further promote knowledge collaboration by disentangling the knowledge representations into the semantic and truthfulness spaces. Extensive experiments validate the effectiveness of our method, which ensures that the post-edit MLLM simultaneously maintains excellent reliability, generality, and locality. The code for UniKE will be available at \url{https://github.com/beepkh/UniKE}.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1189 - Pan 2023
Differentiable Rule Extraction with Large Language Model for Knowledge Graph Reasoning

Pan, Y.; Zhang, L.; Cai, Z.; Zhao, T.; Wei, B.; Liu, J.

J. Frontier. Comput. Sci. Technol. 2023;17(10):2403-2412

2023

DOI: 10.3778/j.issn.1673-9418.2306049 · Ref ID: 4808

Knowledge graph (KG) reasoning aims to predict missing entities or relationships in incomplete triples, completing structured knowledge that can be applied to different downstream tasks. Different from the widely studied black-box methods, such as those based on representation learning, methods based on rule extraction achieve an interpretable reasoning paradigm by generalizing first-order logic rules from the KG. To address the gap between discrete symbolic space and continuous embedding space, a differentiable rule extraction method based on a large pre-trained language model (DRaM) is proposed, which integrates discrete first-order logic rules with a continuous vector space. In view of the influence of atom sequences in first-order logic rules on the reasoning process, a large pre-trained language model is introduced to encode the reasoning process. The differentiable method DRaM, which integrates first-order logic rules, achieves good results on link prediction tasks on three knowledge graph datasets, Family, Kinship, and UMLS, especially for the indicator Hits@10. Comprehensive experimental results show that DRaM can effectively solve the problems of differentiable reasoning on KGs and can extract first-order logic rules with confidences from the reasoning process. DRaM not only enhances reasoning performance with the help of first-order logic rules, but also enhances the interpretability of the method. © 2023 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1378 - Panda 2024
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs

Panda, P.; Agarwal, A.; Devaguptapu, C.; Kaul, M.; Prathosh, A. P.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():13263-13282

Association for Computational Linguistics (ACL) 2024

Ref ID: 4305

Given unstructured text, Large Language Models (LLMs) are adept at answering simple (single-hop) questions. However, as the complexity of the questions increases, the performance of LLMs degrades. We believe this is due to the overhead associated with understanding the complex question followed by filtering and aggregating unstructured information in the raw text. Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text, aiming to provide a structured overview that simplifies information processing. However, this simplistic approach is query-agnostic and the extracted facts are ambiguous as they lack context. To address these drawbacks and to enable LLMs to answer complex (multi-hop) questions with ease, we propose to use a knowledge graph (KG) that is context-aware and is distilled to contain query-relevant information. The use of our compressed distilled KG as input to the LLM results in our method utilizing up to 67% fewer tokens to represent the query-relevant information present in the supporting documents, compared to the state-of-the-art (SoTA) method. Our experiments show consistent improvements over the SoTA across several metrics (EM, F1, BERTScore, and Human Eval) on two popular benchmark datasets (HotpotQA and MuSiQue). © 2024 Association for Computational Linguistics.

Srividya voted
yuexi voted
Final decision
What was the agreed final decision?

#2043 - Papaluca 2024
Zero- and Few-Shots Knowledge Graph Triplet Extraction with Large Language Models

Papaluca, A.; Krefl, D.; Rodríguez Méndez, S. J.; Lensky, A.; Suominen, H.

KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():12-23

Association for Computational Linguistics (ACL) 2024

Ref ID: 4366

In this work, we tested the Triplet Extraction (TE) capabilities of a variety of Large Language Models (LLMs) of different sizes in the Zero- and Few-Shot settings. In detail, we proposed a pipeline that dynamically gathers contextual information from a Knowledge Base (KB), both in the form of context triplets and of (sentence, triplets) pairs as examples, and provides it to the LLM through a prompt. The additional context allowed the LLMs to be competitive with all the older fully trained baselines based on the Bidirectional Long Short-Term Memory (BiLSTM) Network architecture. We further conducted a detailed analysis of the quality of the gathered KB context, finding it to be strongly correlated with the final TE performance of the model. In contrast, the size of the model appeared to only logarithmically improve the TE capabilities of the LLMs. We release the code on GitHub for reproducibility. ©2024 Association for Computational Linguistics.
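
The prompt-construction step described above, packing KB context triplets and (sentence, triplets) demonstrations into one few-shot prompt, can be sketched minimally as follows; the instruction wording and formatting are illustrative assumptions, not the authors' exact prompt:

```python
# Build a few-shot triplet-extraction prompt from KB context and examples.
def build_te_prompt(sentence, context_triplets, examples):
    lines = ["Extract (head, relation, tail) triplets from the sentence."]
    if context_triplets:
        # Context triplets retrieved from the KB, serialized inline.
        lines.append("Known facts: " + "; ".join(
            f"({h}, {r}, {t})" for h, r, t in context_triplets))
    for ex_sentence, ex_triplets in examples:   # few-shot demonstrations
        triplet_str = "; ".join(f"({h}, {r}, {t})" for h, r, t in ex_triplets)
        lines.append(f"Sentence: {ex_sentence}\nTriplets: {triplet_str}")
    lines.append(f"Sentence: {sentence}\nTriplets:")   # the actual query
    return "\n".join(lines)

prompt = build_te_prompt(
    "Marie Curie was born in Warsaw.",
    context_triplets=[("Warsaw", "capital_of", "Poland")],
    examples=[("Ada Lovelace was born in London.",
               [("Ada Lovelace", "born_in", "London")])],
)
print(prompt)
```

In the paper's pipeline the context and examples are gathered dynamically per input sentence, which is what makes their quality measurable against final TE performance.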

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#2896 - Paraiso-Medina 2015
Semantic Normalization and Query Abstraction Based on SNOMED-CT and HL7: Supporting Multicentric Clinical Trials

Paraiso-Medina, S.; Perez-Rey, D.; Bucur, A.; Claerhout, B.; Alonso-Calvo, R.

IEEE Journal of Biomedical and Health Informatics 2015;19(3):1061-1067

2015

DOI: 10.1109/JBHI.2014.2357025 · Ref ID: 6480

Advances in the use of omic data and other biomarkers are increasing the number of variables in clinical research. Additional data have stratified the population of patients and require that current studies be performed among multiple institutions. Semantic interoperability and standardized data representation are a crucial task in the management of modern clinical trials. In the past few years, different efforts have focused on integrating biomedical information. Due to the complexity of this domain and the specific requirements of clinical research, the majority of data integration tasks are still performed manually. This paper presents a semantic normalization process and a query abstraction mechanism to facilitate data integration and retrieval. A process based on well-established standards from the biomedical domain and the latest semantic web technologies has been developed. Methods proposed in this paper have been tested within the EURECA EU research project, where clinical scenarios require the extraction of semantic knowledge from biomedical vocabularies. The aim of this paper is to provide a novel method to abstract from the data model and query syntax. The proposed approach has been compared with other initiatives in the field by storing the same dataset with each of those solutions. Results show an extended functionality and query capabilities at the cost of slightly worse performance in query execution. Implementations in real settings have shown that following this approach, usable interfaces can be developed to exploit clinical trial data outcomes.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2119 - Paraschiv 2015
Analyzing the Semantic Relatedness of Paper Abstracts: An Application to the Educational Research Field

Paraschiv, I. C.; Dascalu, M.; Trausan-Matu, S.; Dessus, P.

2015 20th International Conference on Control Systems and Computer Science 2015;():759-764

2015

DOI: 10.1109/CSCS.2015.146 · Ref ID: 6145

Each domain, along with its knowledge base, changes over time and every timeframe is centered on specific topics that emerge from different ongoing research projects. As searching for relevant resources is a time-consuming process, the automatic extraction of the most important and relevant articles from a domain becomes essential in supporting researchers in their day-to-day activities. The proposed analysis extends previous research focused on extracting co-citations between papers, with the purpose of comparing their overall importance within the domain from a semantic perspective. Our method focuses on the semantic analysis of paper abstracts using Natural Language Processing (NLP) techniques such as Latent Semantic Analysis, Latent Dirichlet Allocation, or specific ontology distances, i.e., WordNet. Moreover, the defined mechanisms are applied to two different subdomains from the corpora generated around the keywords "e-learning" and "computer". Graph visual representations are used to highlight the keywords of each subdomain, links among concepts and between articles, as well as specific document-similarity views or scores reflecting the keyword-abstract overlaps. In the end, conclusions and future improvements are presented, emphasizing the key elements of our research support framework.

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#744 - Park 2023
Selective UMLS knowledge infusion for biomedical question answering

Park, H.; Son, J.; Min, J.; Choi, J.

Sci Rep 2023;13(1):9

2023

DOI: 10.1038/s41598-023-41423-8 · Ref ID: 3102

One of the artificial intelligence applications in the biomedical field is knowledge-intensive question-answering. As domain expertise is particularly crucial in this field, we propose a method for efficiently infusing biomedical knowledge into pretrained language models, ultimately targeting biomedical question-answering. Transferring all semantics of a large knowledge graph into the entire model requires too many parameters, increasing computational cost and time. We investigate an efficient approach that leverages adapters to inject Unified Medical Language System knowledge into pretrained language models, and we question the need to use all semantics in the knowledge graph. This study focuses on strategies of partitioning the knowledge graph and either discarding or merging some partitions for more efficient pretraining. According to the results of three biomedical question answering finetuning datasets, the adapters pretrained on semantically partitioned groups showed more efficient performance in terms of evaluation metrics, required parameters, and time. The results also show that discarding groups with fewer concepts is a better direction for small datasets, and merging these groups is better for large datasets. Furthermore, the metric results show a slight improvement, demonstrating that the adapter methodology is rather insensitive to the group formulation.
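
The discard-versus-merge strategies for small partitions can be sketched roughly as follows, with made-up semantic groups standing in for UMLS semantic types; the group names, threshold, and catch-all label are all assumptions for illustration:

```python
# Partition concepts by semantic group, then either discard small groups or
# merge them into one catch-all group before (hypothetical) adapter pretraining.
def partition(groups, min_size, strategy):
    """groups: {group_name: [concepts]}; strategy: 'discard' or 'merge'."""
    kept = {name: concepts for name, concepts in groups.items()
            if len(concepts) >= min_size}
    small = [c for name, concepts in groups.items()
             if len(concepts) < min_size for c in concepts]
    if strategy == "merge" and small:
        kept["merged"] = small   # pool leftovers into one catch-all group
    return kept                  # with 'discard', small groups are dropped

groups = {"Disorders": ["flu", "asthma", "covid"],
          "Genes": ["brca1"],
          "Drugs": ["aspirin", "ibuprofen"]}
print(sorted(partition(groups, 2, "discard")))   # small 'Genes' group dropped
print(sorted(partition(groups, 2, "merge")))     # small groups pooled instead
```

Per the abstract's findings, the discard branch suits small finetuning datasets and the merge branch suits larger ones.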

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3465 - Park 2024
Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation

Park, Jinyoung; Joo, Minseok; Kim, Joo-Kyung; Kim, Hyunwoo J.

arXiv 2024;():

2024

Ref ID: 8696

Knowledge graph-grounded dialog generation requires retrieving a dialog-relevant subgraph from the given knowledge base graph and integrating it with the dialog history. Previous works typically represent the graph using an external encoder, such as graph neural networks, and retrieve relevant triplets based on the similarity between single-vector representations of triplets and the dialog history. However, these external encoders fail to leverage the rich knowledge of pretrained language models, and the retrieval process is also suboptimal due to the information bottleneck caused by the single-vector abstraction of the dialog history. In this work, we propose Dialog generation with Generative Subgraph Retrieval (DialogGSR), which retrieves relevant knowledge subgraphs by directly generating their token sequences on top of language models. For effective generative subgraph retrieval, we introduce two key methods: (i) structure-aware knowledge graph linearization with self-supervised graph-specific tokens and (ii) graph-constrained decoding utilizing graph structural proximity-based entity informativeness scores for valid and relevant generative retrieval. DialogGSR achieves state-of-the-art performance in knowledge graph-grounded dialog generation, as demonstrated on OpenDialKG and KOMODIS datasets.
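
The structure-aware linearization step, turning a knowledge subgraph into a token sequence the language model can generate, can be illustrated with a minimal sketch; the marker tokens `<head>`, `<rel>`, `<tail>` are assumptions, not the paper's actual graph-specific special tokens:

```python
# Linearize a list of KG triplets into a flat token sequence, inserting
# structural marker tokens so the sequence remains decodable into a graph.
def linearize(triplets):
    tokens = []
    for head, rel, tail in triplets:
        tokens += ["<head>", head, "<rel>", rel, "<tail>", tail]
    return tokens

seq = linearize([("Inception", "directed_by", "Christopher Nolan"),
                 ("Inception", "genre", "science fiction")])
print(" ".join(seq))
```

DialogGSR additionally learns these markers self-supervised and constrains decoding to sequences that correspond to valid paths in the graph, which a toy linearizer like this does not capture.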

Xinchen voted
Mike voted
Final decision
What was the agreed final decision?

#120 - Parolin 2021
CoMe-KE: A New Transformers Based Approach for Knowledge Extraction in Conflict and Mediation Domain

Parolin, E. S.; Hu, Y. B.; Khan, L.; Osorio, J.; Brandt, P. T.; D'Orazio, V.

9th IEEE International Conference on Big Data (IEEE BigData) 2021;():1449-1459

Electr Network Ieee 2021

DOI: 10.1109/BigData52589.2021.9672080 · Ref ID: 3776

Knowledge discovery and extraction approaches attract special attention across industries and areas moving toward the 5V Era. In the political and social sciences, scholars and governments dedicate considerable resources to developing intelligent systems for monitoring, analyzing, and predicting conflicts and affairs involving political entities across the globe. Such systems rely on background knowledge from external knowledge bases that conflict experts commonly maintain manually. The high costs and extensive human effort associated with updating and extending these repositories often compromise their correctness. Here we introduce CoMe-KE (Conflict and Mediation Knowledge Extractor) to automatically extend knowledge bases about conflict and mediation events. We explore state-of-the-art natural language models to discover new political entities, their roles, and their status from news. We propose a distantly supervised method and an innovative zero-shot approach based on a dynamic hypothesis procedure. Our methods leverage pre-trained models through transfer learning techniques to obtain excellent results with no need for labeled data. Finally, we demonstrate the superiority of our method through a comprehensive set of experiments involving two case studies in the social sciences domain. CoMe-KE significantly outperforms the existing baseline, on average doubling the performance in retrieving new political entities.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3140 - Patel 2020
Jointly Learning Knowledge Graph Embeddings, Fine Grain Entity Types and Language Models

Patel, Rajat Hareshkumar; Finin, Tim; Joshi, Karuna

2020;():

University of Maryland, Baltimore County 2020

Ref ID: 7196

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#2215 - Paula 2015
Building Up Conceptual Spaces: An ESOM Supported Strategy

Paula, S. M. De; Carvalho, M. C.; Gudwin, R. R.

2015 Brazilian Conference on Intelligent Systems (BRACIS) 2015;():122-127

2015

DOI: 10.1109/BRACIS.2015.63 · Ref ID: 6728

Intelligent agents need robust knowledge representation schemes to model and solve complex real-world problems. A historical approach is the symbolic representation proposed in classic AI. Although symbolic representations have their appeal, the use of abstract symbols, representing general knowledge about the world, brings limitations to the way agents develop certain cognitive functions, as in the case of language. In the standard symbolic approach, there is no ground for the symbols used internally by the agents, creating a situation known as the symbol grounding problem, as explained by Harnad (1990). To deal with this problem, Gardenfors (2004) introduced a semantic theory named conceptual spaces, which attributes meaning to linguistic symbols. The geometry of such spaces forms a robust structure to conceptualize information. In this paper, we use an unsupervised classifier named Evolving Self-Organizing Maps (ESOM) to act as the computational implementation of conceptual spaces. Our results confirmed ESOM's capability to create concepts, aiding agents in reaching a linguistic consensus about different words exchanged during an object-naming game. Besides providing a way for symbols to acquire meaning in a biologically realistic way, these results also open possibilities for other characteristics of conceptual spaces to be applied to the study of artificial language, e.g., grammatical language.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#382 - Payumo 2024
Intelligent Knowledge Base Search Tool using Large Language Model and Graph Neural Network

Payumo, K.; Subramanian, I.; Lu, T.; Chow, E.

Conference on Pattern Recognition and Prediction XXXV 2024;13040():

National Harbor, MD Spie-Int Soc Optical Engineering 2024

DOI: 10.1117/12.3014075 · Ref ID: 3359

Within many organizations, a vast number of communications, memos, reports and documents have been accumulated in internal servers. Efficiently discovering relevant entries can reduce time spent addressing organizational needs such as personnel skills matching or anomaly resolution. However, per organization, information retrieval on these disparate data types can be challenging, as systems must be designed for their domain while accounting for unstructured and inconsistent datasets. Traditional querying via search terms often requires relevancy tuning by subject matter experts, which makes it difficult to build retrieval systems. We argue that development of retrieval systems can be simplified and enhanced by embedding data with Large Language Models (LLMs), organizing information in a Knowledge Graph (KG) structure, and further encoding their relational features through a Graph Neural Network (GNN). One of the major challenges of using GNNs for information retrieval is optimizing negative edge selection. Training GNNs requires a balanced ratio between positive and negative edges; however, the space of negative edges is exponentially larger than that of positive edges. In this work, we extend the LLM-GNN hybrid architecture by applying ensemble voting on a set of trained LLM-GNNs. Preliminary results have shown modest improvement on our personnel-document matching tasks. This work contributes to a developmental effort that aims to help engineers and scientists find new research opportunities, learn from past mistakes, and quickly address future needs.
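
The balanced negative-edge sampling problem mentioned above can be sketched in a few lines: for each observed (positive) edge, sample one node pair that is not an edge. The tiny directed graph and 1:1 ratio are illustrative assumptions, not the paper's training setup:

```python
import random

# Sample as many non-edges (negatives) as there are observed edges (positives),
# drawing ordered node pairs from the much larger space of possible edges.
def sample_negative_edges(nodes, positive_edges, rng):
    positives = set(positive_edges)
    negatives = set()
    while len(negatives) < len(positives):   # enforce a 1:1 pos/neg ratio
        u, v = rng.sample(nodes, 2)          # two distinct nodes
        if (u, v) not in positives:
            negatives.add((u, v))
    return list(negatives)

rng = random.Random(0)                       # seeded for reproducibility
nodes = list(range(6))
pos = [(0, 1), (1, 2), (2, 3)]
neg = sample_negative_edges(nodes, pos, rng)
assert len(neg) == len(pos) and not set(neg) & set(pos)
```

Uniform sampling like this is the simplest baseline; the selection of *informative* negatives from that huge space is exactly the optimization challenge the abstract highlights.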

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1433 - Paz-Argaman 2024
Into the Unknown: Generating Geospatial Descriptions for New Environments

Paz-Argaman, T.; Palowitch, J.; Kulkarni, S.; Tsarfaty, R.; Baldridge, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():2259-2273

Association for Computational Linguistics (ACL) 2024

Ref ID: 4214

Similar to vision-and-language navigation (VLN) tasks that focus on bridging the gap between vision and language for embodied navigation, the new Rendezvous (RVS) task requires reasoning over allocentric spatial relationships (independent of the observer's viewpoint) using non-sequential navigation instructions and maps. However, performance substantially drops in new environments with no training data. Using open-source descriptions paired with coordinates (e.g., Wikipedia) provides training data but suffers from limited spatially-oriented text, resulting in low geolocation resolution. We propose a large-scale augmentation method for generating high-quality synthetic data for new environments using readily available geospatial data. Our method constructs a grounded knowledge-graph, capturing entity relationships. Sampled entities and relations (“shop north of school”) generate navigation instructions via (i) generating numerous templates using context-free grammar (CFG) to embed specific entities and relations; (ii) feeding the entities and relations into a large language model (LLM) for instruction generation. A comprehensive evaluation on RVS showed that our approach improves the 100-meter accuracy by 45.83% on unseen environments. Furthermore, we demonstrate that models trained with CFG-based augmentation achieve superior performance compared with those trained with LLM-based augmentation, both in unseen and seen environments. These findings suggest the potential advantages of explicitly structuring spatial information for text-based geospatial reasoning in previously unknown, data-scarce scenarios. © 2024 Association for Computational Linguistics.
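
Step (i) above, template generation with a context-free grammar, can be illustrated with a minimal sketch; the grammar, non-terminal names, and the generate helper are assumptions for illustration, not the paper's actual grammar.

```python
import random

# A toy grammar for spatial instruction templates: each production is a word
# sequence in which placeholder tokens are filled with sampled KG entities.
GRAMMAR = {
    "S": [
        ["go to the", "ENT", "that is", "REL", "the", "LANDMARK"],
        ["the", "ENT", "is located", "REL", "the", "LANDMARK"],
    ],
}

def generate(entity, relation, landmark, seed=0):
    """Pick a template from the grammar and embed the sampled KG triple."""
    rng = random.Random(seed)
    production = rng.choice(GRAMMAR["S"])
    fillers = {"ENT": entity, "REL": relation, "LANDMARK": landmark}
    return " ".join(fillers.get(token, token) for token in production)

instruction = generate("shop", "north of", "school")
```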

#1811 - Pei 2024
Research on Public Security Professional Small Sample Knowledge Extraction Method Based on Large Language Model

Pei, B.; Li, X.; Jiang, Z.; Liu, M.

J. Frontier. Comput. Sci. Technol. 2024;18(10):2630-2642

2024

DOI: 10.3778/j.issn.1673-9418.2403039 · Ref ID: 3897

The rapid development of informatization and digitalization in public security business has generated a large amount of law enforcement case data in public security work. However, due to the various types of text and the large amount of information, front-line police officers often face problems such as low reading efficiency and difficulty in aggregating information when reading case files. In order to further utilize law enforcement case text, it is necessary to conduct intelligent analysis and knowledge extraction. However, due to the professionalism, data sensitivity, and confidentiality of public security law enforcement case text, as well as the requirement that public security data not leave the network, only a small number of training samples can be obtained, and traditional deep learning models yield unsatisfactory extraction results. Therefore, this paper proposes to build a large language model in vertical fields with fewer resources and data, and realize the adaptation of the model to the public security profession. The model uses the knowledge editing technology MEMIT (mass-editing memory in a transformer), the low-resource fine-tuning technology LoRA (low-rank adaptation), and prompt templates to improve the model’s understanding of public security knowledge such as police terminology and common sense. Moreover, in order to further improve the knowledge extraction effect of the model, a small-sample law enforcement case text data extraction process is designed to better integrate the professional knowledge related to the case in the model. Experimental results show that the accuracy of the public security professional vertical field large language model integrated with the extraction process in various knowledge extraction tasks is significantly improved compared with traditional methods, which helps front-line police officers quickly, objectively and accurately analyze law enforcement case text, dig out potential case information, and support the intelligent development of public security work. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.

#1143 - Peng 2022
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models

Peng, H.; Wang, X.; Hu, S.; Jin, H.; Hou, L.; Li, J.; Liu, Z.; Liu, Q.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5015-5035

Association for Computational Linguistics (ACL) 2022

Ref ID: 5410

Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only focus on evaluating factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for conceptual knowledge is hard. Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For the tasks, we collect and annotate 24k data instances covering 393 concepts, which is COPEN, a COnceptual knowledge Probing bENchmark. Extensive experiments on different sizes and types of PLMs show that existing PLMs systematically lack conceptual knowledge and suffer from various spurious correlations. We believe this is a critical bottleneck for realizing human-like cognition in PLMs. COPEN and our codes are publicly released at https://github.com/THU-KEG/COPEN. © 2022 Association for Computational Linguistics.

#1171 - Peng 2024
Deja vu: Contrastive Historical Modeling with Prefix-tuning for Temporal Knowledge Graph Reasoning

Peng, M.; Liu, B.; Xu, W.; Jiang, Z.; Zhu, J.

Findings of the Association for Computational Linguistics: NAACL 2024 - Findings 2024;():1178-1191

Association for Computational Linguistics (ACL) 2024

Ref ID: 4603

Temporal Knowledge Graph Reasoning (TKGR) is the task of inferring missing facts for incomplete TKGs in complex scenarios (e.g., transductive and inductive settings), which has been gaining increasing attention. Recently, to mitigate dependence on structured connections in TKGs, text-based methods have been developed to utilize rich linguistic information from entity descriptions. However, suffering from the enormous parameters and inflexibility of pre-trained language models, existing text-based methods struggle to balance textual knowledge and temporal information with computationally expensive purpose-built training strategies. To tap the potential of text-based models for TKGR in various complex scenarios, we propose ChapTER, a Contrastive historical modeling framework with prefix-tuning for TEmporal Reasoning. ChapTER feeds history-contextualized text into the pseudo-Siamese encoders to strike a textual-temporal balance via contrastive estimation between queries and candidates. By introducing virtual time prefix tokens, it applies a prefix-based tuning method to make the frozen PLM capable of TKGR tasks under different settings. We evaluate ChapTER on four transductive and three few-shot inductive TKGR benchmarks, and experimental results demonstrate that ChapTER achieves superior performance compared to competitive baselines with only 0.17% tuned parameters. We conduct thorough analysis to verify the effectiveness, flexibility and efficiency of ChapTER. © 2024 Association for Computational Linguistics.

#1705 - Peng 2023
Ontology Matching using Textual Class Descriptions

Peng, Y.; Alam, M.; Bonald, T.

CEUR Workshop Proceedings 2023;3591():67-72

CEUR-WS 2023

Ref ID: 5079

In this paper, we propose TEXTO, a TEXT-based Ontology matching system. This matcher leverages the rich semantic information of classes available in most ontologies through a combination of a pre-trained word embedding model and a pre-trained language model. Its performance is evaluated on the datasets of the OAEI Common Knowledge Graphs Track, augmented with the description of each class, and a new dataset based on the refreshed alignment of Schema.org and Wikidata. Our results demonstrate that TEXTO outperforms all state-of-the-art matchers in terms of precision, recall and F1 score. In particular, we show that almost perfect class alignment can be achieved using textual content only, excluding any structural information like the graph of classes or the instances of each class. © 2023 Copyright for this paper by its authors.
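
The core idea of text-based class alignment can be sketched as nearest-neighbour matching over embedding similarity; the 3-dimensional vectors below stand in for real description embeddings, and match_classes is a hypothetical helper, not TEXTO's actual pipeline.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def match_classes(source_vecs, target_vecs):
    """Align each source class to its most similar target class."""
    return {
        s_name: max(target_vecs, key=lambda t: cosine(s_vec, target_vecs[t]))
        for s_name, s_vec in source_vecs.items()
    }

# Toy 3-d "description embeddings" standing in for real model output.
source = {"Person": [1.0, 0.1, 0.0], "Place": [0.0, 1.0, 0.2]}
target = {"Human": [0.9, 0.2, 0.0], "Location": [0.1, 1.0, 0.1]}
alignment = match_classes(source, target)  # {'Person': 'Human', 'Place': 'Location'}
```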

#3642 - Peng 2024
Learning Rules from KGs Guided by Language Models

Peng, Zihang; Stepanova, Daria; Ho, Vinh Thinh; Adel, Heike; Russo, Alessandra; Ott, Simon

arXiv 2024;():

2024

Ref ID: 8594

Advances in information extraction have enabled the automatic construction of large knowledge graphs (e.g., Yago, Wikidata or Google KG), which are widely used in many applications like semantic search or data analytics. However, due to their semi-automatic construction, KGs are often incomplete. Rule learning methods, concerned with the extraction of frequent patterns from KGs and casting them into rules, can be applied to predict potentially missing facts. A crucial step in this process is rule ranking. Ranking of rules is especially challenging over highly incomplete or biased KGs (e.g., KGs predominantly storing facts about famous people), as in this case biased rules might fit the data best and be ranked at the top based on standard statistical metrics like rule confidence. To address this issue, prior works proposed to rank rules not only relying on the original KG but also facts predicted by a KG embedding model. At the same time, with the recent rise of Language Models (LMs), several works have claimed that LMs can be used as alternative means for KG completion. In this work, our goal is to verify to which extent the exploitation of LMs is helpful for improving the quality of rule learning systems.

#493 - Perevalov 2024
Language Models as SPARQL Query Filtering for Improving the Quality of Multilingual Question Answering over Knowledge Graphs

Perevalov, A.; Gashkov, A.; Eltsova, M.; Both, A.

24th International Conference Web Engineering (ICWE) 2024;14629():3-18

Tampere, FINLAND Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-62362-2_1 · Ref ID: 3152

Question Answering systems working over Knowledge Graphs (KGQA) generate a ranked list of SPARQL query candidates for a given natural-language question. In this paper, we follow our long-term research agenda of providing trustworthy KGQA systems - here - by presenting a query filtering approach that utilizes (large) language models (LMs/LLMs) such that correct and incorrect queries can be distinguished. In contrast to previous work, we address here multilingual questions represented in major languages (English, German, French, Spanish, and Russian), and confirm the generalizability of our approach by also evaluating it on low-resource languages (Ukrainian, Armenian, Lithuanian, Belarusian, and Bashkir). For our experiments, we used the following LMs: BERT, DistilBERT, Mistral, Zephyr, GPT-3.5, and GPT-4. The LMs were applied to the KGQA systems - QAnswer and MemQA - as SPARQL query filters. The approach was evaluated on the multilingual Wikidata-based dataset QALD-9-plus. The experimental results suggest that the KGQA systems achieve quality improvements for all languages when using our query-filtering approach.
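
The filtering step can be sketched as a threshold over per-candidate correctness scores; toy_scorer is a crude stand-in for the LM classifiers named in the abstract, and the example queries are invented for illustration.

```python
def filter_query_candidates(candidates, scorer, threshold=0.5):
    """Keep only SPARQL candidates whose correctness score clears the
    threshold, preserving the original ranking order."""
    scored = [(query, scorer(query)) for query in candidates]
    return [(query, score) for query, score in scored if score >= threshold]

def toy_scorer(query):
    # Placeholder for an LM-based correctness classifier (e.g. a fine-tuned
    # BERT head); a crude heuristic stands in so the sketch is runnable.
    return 0.9 if "wdt:P36" in query else 0.2

candidates = [
    "SELECT ?c WHERE { wd:Q183 wdt:P36 ?c }",  # capital-of property
    "SELECT ?c WHERE { wd:Q183 wdt:P17 ?c }",  # unrelated property
]
kept = filter_query_candidates(candidates, toy_scorer)
```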

#1117 - Peroni 2014
Conclusions

Peroni, S.

Law. Gov. Technol. Ser. 2014;15():257-262

Springer Science and Business Media B.V. 2014

DOI: 10.1007/978-3-319-04777-5_7 · Ref ID: 5805

In this chapter, I conclude the discussion of my work on Semantic Publishing. In particular, I summarise my own personal contributions in order to address one of the main issues of this field, i.e., the linking of a text to the formal representation of its meaning and thus the representation of its structure and of its argumentative discourse. In addition, I summarise my own contribution on the development of interfaces to hide the complexity of markup and ontology formalisms behind user-friendly views in order to help users of Semantic Publishing (e.g., scholars, publishers, archivists, librarians, etc.) that may have difficulties in interacting with Semantic Publishing technologies. Finally, I conclude the chapter by introducing planned future work for all the languages, models and tools presented. © 2014, Springer International Publishing Switzerland.

#988 - Pertsas 2024
An Annotated Dataset for Transformer-based Scholarly Information Extraction and Linguistic Linked Data Generation

Pertsas, V.; Kasapaki, M.; Constantopoulos, P.

9th Workshop on Linked Data in Linguistics: Resources, Applications, Best Practices, LDL 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;():84-93

European Language Resources Association (ELRA) 2024

Ref ID: 4635

We present a manually curated and annotated, multidisciplinary dataset of 15,262 sentences from research articles (abstract and main text) that can be used for transformer-based extraction from scholarly publications of three types of entities: 1) research methods, named entities of variable length, 2) research goals, entities that appear as textual spans of variable length with mostly fixed lexico-syntactic structure, and 3) research activities, entities that appear as textual spans of variable length with complex lexico-syntactic structure. We explore the capabilities of our dataset by using it for training/fine-tuning various ML and transformer-based models. We compare our fine-tuned models as well as LLM responses (chat-GPT 3.5) based on 10-shot learning, by measuring F1 scores in token-based, entity-based strict and entity-based partial evaluations across interdisciplinary and discipline-specific datasets in order to capture any possible differences in discipline-oriented writing styles. Results show that fine-tuning of transformer-based models significantly outperforms the performance of few-shot learning of LLMs such as chat-GPT, highlighting the significance of annotation datasets in such tasks. Our dataset can also be used as a source for linguistic linked data by itself. We demonstrate this by presenting indicative queries in SPARQL, executed over such an RDF knowledge graph. © 2024 ELRA Language Resource Association.

#1676 - Phatak 2024
Narrating Causal Graphs with Large Language Models

Phatak, A.; Mago, V. K.; Agrawal, A.; Giabbanelli, P. J.

Proceedings of the Annual Hawaii International Conference on System Sciences 2024;():7530-7539

IEEE Computer Society 2024

Ref ID: 4795

The use of generative AI to create text descriptions from graphs has mostly focused on knowledge graphs, which connect concepts using facts. In this work we explore the capability of large pretrained language models to generate text from causal graphs, where salient concepts are represented as nodes and causality is represented via directed, typed edges. The causal reasoning encoded in these graphs can support applications as diverse as healthcare or marketing. Using two publicly available causal graph datasets, we empirically investigate the performance of four GPT-3 models under various settings. Our results indicate that while causal text descriptions improve with training data, compared to fact-based graphs, they are harder to generate under zero-shot settings. Results further suggest that users of generative AI can deploy future applications faster since similar performances are obtained when training a model with only a few examples as compared to fine-tuning via a large curated dataset. © 2024 IEEE Computer Society. All rights reserved.

#3701 - Piantadosi 2022
Meaning without reference in large language models

Piantadosi, Steven T.; Hill, Felix

arXiv 2022;():

2022

Ref ID: 7570

The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings. Contrary to claims that LLMs possess no meaning whatsoever, we argue that they likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role. Because conceptual role is defined by the relationships between internal representational states, meaning cannot be determined from a model's architecture, training data, or objective function, but only by examination of how its internal states relate to each other. This approach may clarify why and how LLMs are so successful and suggest how they can be made more human-like.

#883 - Piat 2023
What does KnowBert-UMLS forget?

Piat, G.; Semmar, N.; Tourille, J.; Allauzen, A.; Essafi, H.; Ieee

20th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA) 2023;():

Giza, EGYPT Ieee 2023

DOI: 10.1109/aiccsa59173.2023.10479333 · Ref ID: 3060

Integrating a source of structured prior knowledge, such as a knowledge graph, into transformer-based language models is an increasingly popular method for increasing data efficiency and adapting them to a target domain. However, most methods for integrating structured knowledge into language models require additional training in order to adapt the model to the non-textual modality. This process typically leads to some amount of catastrophic forgetting on the general domain. KnowBert is one such knowledge integration method which can incorporate information from a variety of knowledge graphs to enhance the capabilities of transformer-based language models such as BERT. We conduct a qualitative analysis of the results of KnowBert-UMLS, a biomedically specialized KnowBert model, on a variety of linguistic tasks. Our results reveal that its increased understanding of biomedical concepts comes at the cost, specifically, of general common-sense knowledge and understanding of casual speech.

#1358 - Plenz 2024
Graph Language Models

Plenz, M.; Frank, A.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4477-4494

Association for Computational Linguistics (ACL) 2024

Ref ID: 4399

While Language Models (LMs) are the workhorses of NLP, their interplay with structured knowledge graphs (KGs) is still actively researched. Current methods for encoding such graphs typically either (i) linearize them for embedding with LMs - which underutilizes structural information, or (ii) use Graph Neural Networks (GNNs) to preserve the graph structure - but GNNs cannot represent text features as well as pretrained LMs. In our work we introduce a novel LM type, the Graph Language Model (GLM), that integrates the strengths of both approaches and mitigates their weaknesses. The GLM parameters are initialized from a pretrained LM to enhance understanding of individual graph concepts and triplets. Simultaneously, we design the GLM's architecture to incorporate graph biases, thereby promoting effective knowledge distribution within the graph. This enables GLMs to process graphs, texts, and interleaved inputs of both. Empirical evaluations on relation classification tasks show that GLM embeddings surpass both LM- and GNN-based baselines in supervised and zero-shot settings, demonstrating their versatility. © 2024 Association for Computational Linguistics.
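
Baseline (i), linearizing the graph for the LM, can be sketched in a few lines; the [SEP] separator and the linearize_triples helper are assumptions for illustration, not the paper's exact serialization.

```python
def linearize_triples(triples, sep=" [SEP] "):
    """Flatten (head, relation, tail) triples into one token sequence,
    discarding the explicit graph structure a GNN (or the GLM) would keep."""
    return sep.join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in triples)

text = linearize_triples([
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
])
```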

#2298 - Pol 2023
A Data-Driven Approach for Modeling Unknown Multi-Scale Systems

Pol, M.; Diaconescu, A.

2023 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion (ACSOS-C) 2023;():35-40

2023

DOI: 10.1109/ACSOS-C58168.2023.00033 · Ref ID: 6501

Complex adaptive systems often organize via multiple abstraction levels, or ‘scales’, interconnected by feedback loops. This enables adaptation and survival in changing environments, while managing complexity with limited resources. For an external observer unaware of such multi-scale structure, modeling an unknown system may be a complicated endeavor. This position paper proposes a data-driven approach for addressing this issue. It generates multi-scale models from incomplete monitoring data, capitalizing on the behavioral regularities that stem from its feedback loops. It also defines the appropriate language elements for expressing these multi-scale models. We validate our approach on data obtained from a theoretical multi-scale system: a holonic cellular automata (HCA) simulator. Results show that the proposed approach can identify the HCA's three abstraction levels and main modeling concepts. This is an encouraging first step towards establishing automatic methods for multi-scale model discovery from partial observations.

#3244 - Porada 2019
Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text

Porada, Ian; Suleman, Kaheer; Cheung, Jackie Chi Kit

arXiv 2019;():

2019

Ref ID: 7382

Modeling semantic plausibility requires commonsense knowledge about the world and has been used as a testbed for exploring various knowledge representations. Previous work has focused specifically on modeling physical plausibility and shown that distributional methods fail when tested in a supervised setting. At the same time, distributional models, namely large pretrained language models, have led to improved results for many natural language understanding tasks. In this work, we show that these pretrained language models are in fact effective at modeling physical plausibility in the supervised setting. We therefore present the more difficult problem of learning to model physical plausibility directly from text. We create a training set by extracting attested events from a large corpus, and we provide a baseline for training on these attested events in a self-supervised manner and testing on a physical plausibility task. We believe results could be further improved by injecting explicit commonsense knowledge into a distributional model.

#823 - Postiglione 2021
Towards an Italian Healthcare Knowledge Graph

Postiglione, M.

14th International Conference on Similarity Search and Applications (SISAP) 2021;13058():387-394

TU Dortmund, ELECTR NETWORK Springer International Publishing Ag 2021

DOI: 10.1007/978-3-030-89657-7_29 · Ref ID: 3086

Electronic Health Records (EHRs), Big Data, Knowledge Graphs (KGs) and machine learning can potentially be a great step towards the technological shift from one-size-fits-all medicine, where treatments are based on an equal protocol for all the patients, to precision medicine, which takes account of all their individual information: lifestyle, preferences, health history, genomics, and so on. However, the lack of data which characterizes low-resource languages is a huge limitation for the application of the above-mentioned technologies. In this work, we will try to fill this gap by means of transformer language models and few-shot approaches, and we will apply similarity-based deep learning techniques on the constructed KG for downstream applications. The proposed architecture is general and thus applicable to any low-resource language.

#2632 - Potts 2022
Leveraging Multiple Representations of Topic Models for Knowledge Discovery

Potts, C. M.; Savaliya, A.; Jhala, A.

IEEE Access 2022;10():104696-104705

2022

DOI: 10.1109/ACCESS.2022.3210529 · Ref ID: 6184

Topic models are often useful in categorization of related documents in information retrieval and knowledge discovery systems, especially for large datasets. Interpreting the output of these models remains an ongoing challenge for the research community. The typical practice in the application of topic models is to tune the parameters of a chosen model for a target dataset and select the model with the best output based on a given metric. We present a novel perspective on topic analysis by presenting a process for combining output from multiple models with different theoretical underpinnings. We show that this results in our ability to tackle novel tasks such as semantic characterization of content that cannot be carried out by using single models. One example task is to characterize the differences between topics or documents in terms of their purpose and also importance with respect to the underlying output of the discovery algorithm. To show the potential benefit of leveraging multiple models we present an algorithm to map the term-space of Latent Dirichlet Allocation (LDA) to the neural document-embedding space of doc2vec. We also show that by utilizing both models in parallel and analyzing the resulting document distributions using the Normalized Pointwise Mutual Information (NPMI) metric we can gain insight into the purpose and importance of topics across models. This approach moves beyond topic identification to a richer characterization of the information and provides a better understanding of the complex relationships between these typically competing techniques.
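
The NPMI metric used above has a compact closed form; this sketch shows its boundary behaviour on toy probabilities (the probabilities themselves are invented for illustration).

```python
import math

def npmi(p_xy, p_x, p_y):
    """Normalized Pointwise Mutual Information, bounded in [-1, 1].

     1.0 -> the two terms only ever occur together
     0.0 -> the terms are statistically independent
    """
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

perfect = npmi(0.1, 0.1, 0.1)       # terms always co-occur
independent = npmi(0.01, 0.1, 0.1)  # p_xy == p_x * p_y
```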

#549 - Pouramini 2024
Matching tasks to objectives: Fine-tuning and prompt-tuning strategies for encoder-decoder pre-trained language models

Pouramini, A.; Faili, H.

Appl. Intell. 2024;54(20):9783-9810

2024

DOI: 10.1007/s10489-024-05660-2 · Ref ID: 3692

Prompt-based learning has emerged as a dominant paradigm in natural language processing. This study explores the impact of diverse pre-training objectives on the performance of encoder-decoder pre-trained language models across generation and question answering tasks, with a focus on commonsense knowledge retrieval and completion. We highlight the benefits of incorporating multiple objectives during both pre-training and fine-tuning stages. We introduce the Match Task to Objective (MTO) framework and methods for determining the appropriate objective for a given task. This framework offers automated methods to prepare task-related data for adaptation through unsupervised training, based on the identified objective. In the fine-tuning stage, we design novel templates that align with the objectives of the pre-training and adaptation stages. When aligned with task requirements, these strategies can achieve a performance gain of over 120% compared to conventional methods in few-shot settings. They significantly outperform related works in few-shot settings and exceed the baseline even in full-dataset scenarios. Furthermore, we extend this approach to include prompt-tuning methodologies, providing guidance for more effective soft prompt engineering and optimization. Our strategies significantly enhance prompt-tuning performance as well. These insights hold substantial value, precisely guiding the selection and optimization of models customized for specific tasks. Code is available at https://github.com/puraminy/MTO/

#3295 - Pradeep 2024
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models

Pradeep, Ronak; Lee, Daniel; Mousavi, Ali; Pound, Jeff; Sang, Yisi; Lin, Jimmy; Ilyas, Ihab; Potdar, Saloni; Arefiyan, Mostafa; Li, Yunyao

arXiv 2024;():

2024

Ref ID: 8524

The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present ConvKGYarn, a scalable method for generating up-to-date and configurable conversational KGQA datasets. Qualitative psychometric analyses confirm our method can generate high-quality datasets rivaling a popular conversational KGQA dataset while offering it at scale and covering a wide range of human-interaction configurations. We showcase its utility by testing LLMs on diverse conversations - exploring model behavior on conversational KGQA sets with different configurations grounded in the same KG fact set. Our results highlight the ability of ConvKGYarn to improve KGQA foundations and evaluate parametric knowledge of LLMs, thus offering a robust solution to the constantly evolving landscape of conversational assistants.

#3923 - Prasad 2024
Towards Development of Automated Knowledge Maps and Databases for Materials Engineering using Large Language Models

Prasad, Deepak; Pimpude, Mayur; Alankar, Alankar

arXiv 2024;():

2024

Ref ID: 8113

In this work, a Large Language Model (LLM) based workflow is presented that utilizes the OpenAI ChatGPT model GPT-3.5-turbo-1106 and the Google Gemini Pro model to create summaries of text, data and images from research articles. It is demonstrated that by using a series of processing steps, the key information can be arranged in tabular form and knowledge graphs to capture underlying concepts. Our method offers efficiency and comprehension, enabling researchers to extract insights more effectively. Evaluation based on a diverse Scientific Paper Collection demonstrates our approach in facilitating discovery of knowledge. This work contributes to accelerated material design by smart literature review. The method has been tested based on various qualitative and quantitative measures of gathered information. In the performance evaluation, the ChatGPT model achieved an F1 score of 0.40 for an exact match (ROUGE-1, ROUGE-2) but an impressive 0.479 for a relaxed match (ROUGE-L, ROUGE-Lsum) on the structured data format. Google Gemini Pro outperforms ChatGPT with an F1 score of 0.50 for an exact match and 0.63 for a relaxed match. This method facilitates high-throughput development of a database relevant to materials informatics. For demonstration, an example of data extraction and knowledge graph formation based on a manuscript about a titanium alloy is discussed.
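
The exact-match flavour of the reported scores corresponds to unigram-overlap ROUGE; below is a minimal ROUGE-1 F1 sketch. The example sentences are invented, and real evaluations use a library such as rouge-score.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap F1, i.e. the ROUGE-1 'exact match' flavour."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the alloy melts at high temperature",
                  "the alloy melts above a threshold temperature")
```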

#3186 - Priyanshu 2023
Are Chatbots Ready for Privacy-Sensitive Applications? An Investigation into Input Regurgitation and Prompt-Induced Sanitization

Priyanshu, Aman; Vijay, Supriti; Kumar, Ayush; Naidu, Rakshit; Mireshghallah, Fatemehsadat

arXiv 2023;():

2023

Ref ID: 7726

LLM-powered chatbots are becoming widely adopted in applications such as healthcare, personal assistants, industry hiring decisions, etc. In many of these cases, chatbots are fed sensitive, personal information in their prompts, as samples for in-context learning, retrieved records from a database, or as part of the conversation. The information provided in the prompt could directly appear in the output, which might have privacy ramifications if there is sensitive information there. As such, in this paper, we aim to understand the input copying and regurgitation capabilities of these models during inference and how they can be directly instructed to limit this copying by complying with regulations such as HIPAA and GDPR, based on their internal knowledge of them. More specifically, we find that when ChatGPT is prompted to summarize cover letters of 100 candidates, it would retain personally identifiable information (PII) verbatim in 57.4% of cases, and we find this retention to be non-uniform between different subgroups of people, based on attributes such as gender identity. We then probe ChatGPT's perception of privacy-related policies and privatization mechanisms by directly instructing it to provide compliant outputs and observe a significant omission of PII from output.
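
The 57.4% retention figure is a verbatim-substring rate over model outputs; a minimal sketch of that measurement follows (the summaries and PII strings are invented for illustration).

```python
def verbatim_retention_rate(summaries, pii_per_input):
    """Fraction of outputs that reproduce at least one of the corresponding
    input's PII strings verbatim."""
    hits = sum(
        any(pii in summary for pii in pii_values)
        for summary, pii_values in zip(summaries, pii_per_input)
    )
    return hits / len(summaries)

# Toy data standing in for the paper's cover-letter experiment.
summaries = [
    "Candidate jane.doe@example.com has five years of experience.",
    "The applicant is an experienced engineer.",
    "Reach John at 555-0199 for references.",
]
pii = [["jane.doe@example.com"], ["john.smith@example.com"], ["555-0199"]]
rate = verbatim_retention_rate(summaries, pii)  # 2 of 3 outputs leak PII
```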

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3672 - Puchert 2023
LLMMaps – A Visual Metaphor for Stratified Evaluation of Large Language Models

Puchert, Patrik; Poonam, Poonam; van Onzenoodt, Christian; Ropinski, Timo

arXiv 2023;():

2023

Ref ID: 7667

Large Language Models (LLMs) have revolutionized natural language processing and demonstrated impressive capabilities in various tasks. Unfortunately, they are prone to hallucinations, where the model exposes incorrect or false information in its responses, which renders diligent evaluation approaches mandatory. While LLM performance in specific knowledge fields is often evaluated based on question and answer (Q&A) datasets, such evaluations usually report only a single accuracy number for the dataset, which often covers an entire field. This field-based evaluation is problematic with respect to transparency and model improvement. A stratified evaluation could instead reveal subfields where hallucinations are more likely to occur and thus help to better assess LLMs' risks and guide their further development. To support such stratified evaluations, we propose LLMMaps as a novel visualization technique that enables users to evaluate LLMs' performance with respect to Q&A datasets. LLMMaps provide detailed insights into LLMs' knowledge capabilities in different subfields by transforming Q&A datasets as well as LLM responses into an internal knowledge structure. An extension for comparative visualization furthermore allows for the detailed comparison of multiple LLMs. To assess LLMMaps, we use them to conduct a comparative analysis of several state-of-the-art LLMs, such as BLOOM, GPT-2, GPT-3, ChatGPT and LLaMa-13B, as well as two qualitative user evaluations. All necessary source code and data for generating LLMMaps, to be used in scientific publications and elsewhere, is available on GitHub: https://github.com/viscom-ulm/LLMMaps
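The stratification idea reduces, at its core, to reporting accuracy per subfield rather than one number per dataset. A minimal sketch of that aggregation step (illustrative, not the LLMMaps code):

```python
from collections import defaultdict

def stratified_accuracy(results):
    """Per-subfield accuracy from (subfield, is_correct) records.

    A single dataset-wide accuracy would hide which subfields drive
    hallucinations; this breakdown exposes them.
    """
    totals, correct = defaultdict(int), defaultdict(int)
    for subfield, ok in results:
        totals[subfield] += 1
        correct[subfield] += bool(ok)
    return {s: correct[s] / totals[s] for s in totals}
```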

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#579 - Putman 2023
The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species

Putman, T. E.; Schaper, K.; Matentzoglu, N.; Rubinetti, V. P.; Alquaddoomi, F. S.; Cox, C.; Caufield, J. H.; Elsarboukh, G.; Gehrke, S.; Hegde, H.; Reese, J. T.; Braun, I.; Bruskiewich, R. M.; Cappelletti, L.; Carbon, S.; Caron, A. R.; Chan, L. E.; Chute, C. G.; Cortes, K. G.; De Souza, V.; Fontana, T.; Harris, N. L.; Hartley, E. L.; Hurwitz, E.; Jacobsen, J. O. B.; Krishnamurthy, M.; Laraway, B. J.; McLaughlin, J. A.; McMurry, J. A.; Moxon, S. A. T.; Mullen, K. R.; O'Neil, S. T.; Shefchek, K. A.; Stefancsik, R.; Toro, S.; Vasilevsky, N. A.; Walls, R. L.; Whetzel, P. L.; Osumi-Sutherland, D.; Smedley, D.; Robinson, P. N.; Mungall, C. J.; Haendel, M. A.; Munoz-Torres, M. C.

Nucleic Acids Res. 2023;():12

2023

DOI: 10.1093/nar/gkad1082 · Ref ID: 3322

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1986 - Qi 2023
Traditional Chinese Medicine Prescription Recommendation Model Based on Large Language Models and Graph Neural Networks

Qi, J.; Wang, X.; Yang, T.

Proceedings - 2023 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():4623-4627

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BIBM58861.2023.10385489 · Ref ID: 4970

Background: Traditional Chinese medicine (TCM) has a millennia-long history, offering unique treatments and insights into global health. Given the intricate symptoms and shifting syndrome patterns, prescribing can be tough for young doctors. TCM prescription recommendations can help these doctors address their experience gap. In recent years, with advancements in technologies such as artificial intelligence and big data, intelligent recommendations for TCM prescriptions have become feasible, holding significant implications for enhancing treatment efficacy and optimizing patient experience. Objective: This study aims to establish a novel TCM prescription recommendation model by integrating large language models with Graph Neural Network (GNN) to enhance the accuracy of prescription suggestions. Method: Based on the co-occurrence of symptoms and herbal medicines, we constructed symptom graphs, symptom-herb graphs, and herb-herb graphs. Using Graph Convolutional Network (GCN), we acquired embeddings for both symptoms and herbs. The symptom embeddings are then integrated with insights from large language model embeddings, while auxiliary information from an external knowledge graph is incorporated into the herb embeddings. A final list of herb recommendations was generated by interacting with the embeddings of symptoms and herbs. Results: The proposed algorithm achieved 22.1%, 17.2%, and 13% on the evaluation metrics P@5, P@10, and P@20, respectively. Concurrently, scores for R@5, R@10, and R@20 were 14%, 24%, and 32.5%, respectively. The P@5 metric surpassed the KDHR by 4.7%, and the R@20 metric exceeded the KDHR by 6%. Overall, the performance of our model outperformed other baseline models across various evaluation criteria. Conclusion: The TCM prescription recommendation model, infused with information from a large language model, can effectively enhance the outcomes of TCM prescription recommendations. The study may offer valuable insights for auxiliary clinical research and treatment in TCM. © 2023 IEEE.
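The P@k and R@k metrics reported above compare a ranked herb list against the ground-truth prescription. A minimal sketch (illustrative, not the paper's evaluation code):

```python
def precision_recall_at_k(recommended, relevant, k):
    """P@k and R@k for a ranked recommendation list.

    P@k: fraction of the top-k recommendations that are relevant.
    R@k: fraction of all relevant items found in the top k.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```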

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3854 - Qi 2024
Safety Control of Service Robots with LLMs and Embodied Knowledge Graphs

Qi, Yong; Kyebambo, Gabriel; Xie, Siyuan; Shen, Wei; Wang, Shenghui; Xie, Bitao; He, Bin; Wang, Zhipeng; Jiang, Shuo

arXiv 2024;():

2024

Ref ID: 8324

Safety limitations in service robotics across various industries have raised significant concerns about the need for robust mechanisms ensuring that robots adhere to safe practices, thereby preventing actions that might harm humans or cause property damage. Despite advances, including the integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), challenges in ensuring consistent safety in autonomous robot actions persist. In this paper, we propose a novel integration of Large Language Models with Embodied Robotic Control Prompts (ERCPs) and Embodied Knowledge Graphs (EKGs) to enhance the safety framework for service robots. ERCPs are designed as predefined instructions that ensure LLMs generate safe and precise responses. These responses are subsequently validated by EKGs, which provide a comprehensive knowledge base ensuring that the actions of the robot are continuously aligned with safety protocols, thereby promoting safer operational practices in varied contexts. Our experimental setup involved diverse real-world tasks, where robots equipped with our framework demonstrated significantly higher compliance with safety standards compared to traditional methods. This integration fosters secure human-robot interactions and positions our methodology at the forefront of AI-driven safety innovations in service robotics.
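The validation step described above can be caricatured as checking an LLM-proposed (action, object) pair against a knowledge base of permitted relations. Everything in this sketch, the names, the edges, and the rejection rule, is a hypothetical stand-in for the paper's ERCP/EKG pipeline:

```python
# Hypothetical "embodied knowledge graph" fragment: edges listing which
# actions are known-safe for which objects, plus a hard deny-list.
SAFE_EDGES = {
    ("grasp", "cup"),
    ("place", "cup"),
    ("grasp", "book"),
}
FORBIDDEN_OBJECTS = {"knife", "stove"}

def validate_action(action: str, obj: str) -> bool:
    """Accept an LLM-proposed action only if the KG lists it as safe."""
    if obj in FORBIDDEN_OBJECTS:
        return False
    return (action, obj) in SAFE_EDGES
```

The point of routing LLM output through such a check is that unknown or unlisted actions default to rejection, rather than trusting the model's free-form response.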

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3436 - Qi 2023
FoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph Prompt

Qi, Zhixiao; Yu, Yijiong; Tu, Meiqi; Tan, Junyi; Huang, Yongfeng

arXiv 2023;():

2023

Ref ID: 7810

Currently, the construction of large language models in specific domains is done by fine-tuning on a base model. Some models also incorporate knowledge bases without the need for pre-training, because the base model already acquires domain-specific knowledge during the pre-training process. We build a large language model for food testing. Unlike the above approaches, a significant amount of data in this domain exists as scanned domain-standard documents. In addition, there is a large amount of untrained structured knowledge. Therefore, we introduce an incremental pre-training step to inject this knowledge into the large language model. In this paper, we propose a method for handling structured knowledge and scanned documents in incremental pre-training. To overcome the problem of machine hallucination, we construct a knowledge graph to serve as an external knowledge base supporting retrieval in the large language model. It is worth mentioning that this paper is a technical report of our pre-release version, and we will report our specific experimental data in future versions.
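Using a knowledge graph as an external retrieval source typically means selecting triples that mention entities from the user's question and prepending them to the prompt as factual context. A toy sketch of that pattern; the triples and values below are invented placeholders, not food-testing facts:

```python
# Placeholder KG triples (subject, predicate, object); not real limits.
TRIPLES = [
    ("additive_X", "max_limit", "10 mg/kg"),
    ("additive_X", "tested_by", "method_A"),
    ("additive_Y", "tested_by", "method_B"),
]

def build_prompt(question: str) -> str:
    """Prepend KG triples whose subject appears in the question."""
    context = [
        f"{s} {p} {o}."
        for s, p, o in TRIPLES
        if s.lower() in question.lower()
    ]
    return "Context:\n" + "\n".join(context) + f"\nQuestion: {question}"
```

Grounding the prompt in retrieved triples like this is what lets the KG counteract hallucination: the model is asked to answer from supplied facts rather than from memory alone.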

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3708 - Qian 2023
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs

Qian, Cheng; Zhao, Xinran; Wu, Sherry Tongshuang

arXiv 2023;():

2023

Ref ID: 7834

Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge. However, in order to remain up-to-date and align with human instructions, LLMs inevitably require external knowledge during their interactions with users. This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge? To investigate this question, we propose a framework that systematically elicits LLM parametric knowledge and introduces external knowledge. Specifically, we uncover the impacts by constructing a parametric knowledge graph to reveal the different knowledge structures of LLMs, and introduce external knowledge through distractors of varying degrees, methods, positions, and formats. Our experiments on both black-box and open-source models demonstrate that LLMs tend to produce responses that deviate from their parametric knowledge, particularly when they encounter direct conflicts or confounding changes of information within detailed contexts. We also find that while LLMs are sensitive to the veracity of external knowledge, they can still be distracted by unrelated information. These findings highlight the risk of hallucination when integrating external knowledge, even indirectly, during interactions with current LLMs. All the data and results are publicly available.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1267 - Qian 2023
Enhancing Text Comprehension via Fusing Pre-trained Language Model with Knowledge Graph

Qian, J.; Li, G.; Atkinson, K.; Yue, Y.

ACM International Conference Proceeding Series 2023;():353-360

Association for Computing Machinery 2023

DOI: 10.1145/3639631.3639689 · Ref ID: 4753

Pre-trained language models (PLMs) such as BERT and GPTs capture rich linguistic and syntactic knowledge from pre-training over large-scale text corpora, which can be further fine-tuned for specific downstream tasks. However, these models still have limitations, as they rely on knowledge gained from plain text and ignore structured knowledge such as knowledge graphs (KGs). Recently, there has been a growing trend of explicitly integrating KGs into PLMs to improve their performance. For instance, K-BERT incorporates KG triples as domain-specific supplements into input sentences. Nevertheless, we have observed that such methods do not consider the semantic relevance between the introduced knowledge and the original input sentence, leading to the issue of knowledge impurities. To address this issue, we propose a semantic matching-based approach that enriches the input text with knowledge extracted from an external KG. The architecture of our model comprises three components: the knowledge retriever (KR), the knowledge injector (KI), and the knowledge aggregator (KA). The KR, built upon the sentence representation learning model CoSENT, retrieves triples with high semantic relevance to the input sentence from an external KG to alleviate the issue of knowledge impurities. The KI then integrates the retrieved triples into the input text by converting the original sentence into a knowledge tree with multiple branches; the knowledge tree is transformed into an accessible sequence of text that can be fed into the KA. Finally, the KA takes the flattened knowledge tree and passes it through an embedding layer and a masked Transformer encoder. We conducted extensive evaluations on eight datasets covering five text comprehension tasks, and the experimental results demonstrate that our approach exhibits competitive advantages over popular knowledge-enhanced PLMs such as K-BERT and ERNIE. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
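The knowledge retriever (KR) step can be sketched as below, with simple token overlap standing in for the CoSENT-based semantic similarity; this is an illustration of the filtering idea, not the authors' implementation:

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, a crude stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve_triples(sentence, triples, threshold=0.2):
    """Keep only KG triples semantically close to the input sentence,
    filtering out 'knowledge impurities' before injection."""
    kept = []
    for s, p, o in triples:
        if jaccard(sentence, f"{s} {p} {o}") >= threshold:
            kept.append((s, p, o))
    return kept
```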

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#397 - Qiao 2022
A joint model for entity and relation extraction based on BERT

Qiao, B.; Zou, Z. Y.; Huang, Y.; Fang, K.; Zhu, X. H.; Chen, Y. M.

Neural Comput. Appl. 2022;34(5):3471-3481

2022

DOI: 10.1007/s00521-021-05815-z · Ref ID: 3148

In recent years, the knowledge graph has attained significant achievements in many specific fields and has become one of the core driving forces for the development of the internet and artificial intelligence. However, there is no mature knowledge graph in the field of agriculture, so research on construction techniques for an agricultural knowledge graph is of great significance. Named entity recognition and relation extraction are key steps in the construction of a knowledge graph. In this paper, we introduce the BERT pre-trained language model into the joint extraction model LSTM-LSTM-Bias and propose an agricultural entity-relation joint extraction model, BERT-BILSTM-LSTM, which is applied to the standard dataset NYT and the self-built agricultural dataset AgriRelation. Experimental results show that the model can effectively extract the relationships between agricultural entities.
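As an illustration of the entity-extraction side of such a joint model, the final step of turning per-token BIO tags into entity spans might look like this (a generic sketch with invented tags, not the paper's code):

```python
def decode_bio(tokens, tags):
    """Convert parallel token/BIO-tag lists into (entity_text, type) spans."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):                 # new entity begins
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == etype:
            current.append(tok)                  # entity continues
        else:                                    # "O" or inconsistent tag
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities
```

The relation-extraction head would then classify the relation between pairs of such decoded spans.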

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1626 - Qiu 2024
Matching Tabular Data to Knowledge Graph with Effective Core Column Set Discovery

Qiu, J.; Song, A.; Jin, J.; Chen, J.; Zhang, X.; Fang, X.; Zhang, T.

ACM Trans. Web 2024;18(4):

2024

DOI: 10.1145/3694979 · Ref ID: 3851

Matching tabular data to a knowledge graph (KG) is critical for understanding the semantic column types, column relationships, and entities of a table. Existing matching approaches rely heavily on core columns that represent primary subject entities on which other columns in the table depend. However, discovering these core columns before understanding the table's semantics is challenging. Most prior works use heuristic rules, such as the leftmost column, to discover a single core column, while an insightful discovery of the core column set that accurately captures the dependencies between columns is often overlooked. To address these challenges, we introduce Dependency-aware Core Column Set Discovery (DaCo), an iterative method that uses a novel rough matching strategy to identify both inter-column dependencies and the core column set. Additionally, DaCo can be seamlessly integrated with pre-trained language models, as proposed in the optimization module. Unlike other methods, DaCo does not require labeled data or contextual information, making it suitable for real-world scenarios. In addition, it can identify multiple core columns within a table, which is common in real-world tables. We conduct experiments on six datasets, including five datasets with single core columns and one dataset with multiple core columns. Our experimental results show that DaCo outperforms existing core column set detection methods, further improving the effectiveness of table understanding tasks. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
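A toy sketch of the intuition behind core column discovery: score each column by how many of its cells link to known KG entities, and take the best scorer as a candidate. DaCo's actual iterative, dependency-aware method is considerably more involved; the entity set and table here are invented for illustration:

```python
# Stand-in for an entity-linking lookup against a knowledge graph.
KG_ENTITIES = {"Berlin", "Paris", "Tokyo"}

def core_column_candidate(table):
    """table: list of rows (lists of cell strings); returns the index of the
    column whose cells most often link to KG entities."""
    n_cols = len(table[0])
    def score(col):
        cells = [row[col] for row in table]
        return sum(c in KG_ENTITIES for c in cells) / len(cells)
    return max(range(n_cols), key=score)
```

Note that heuristics like "the leftmost column", which the abstract criticizes, would pick column 0 here even when the entity-bearing column sits elsewhere.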

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#185 - Qiu 2023
DOCUMENT UNDERSTANDING-BASED DESIGN SUPPORT: LANGUAGE MODEL BASED DESIGN KNOWLEDGE EXTRACTION

Qiu, Y. J.; Jin, Y.; Amer Soc Mechanical, Engineers

ASME International Design Engineering Technical Conferences / Computers and Information in Engineering Conference (IDETC-CIE) / 49th Design Automation Conference (DAC) 2023;():

Boston, MA Amer Soc Mechanical Engineers 2023

Ref ID: 3799

Design knowledge in the vast amount of design reports and documents can be a great resource for designers in their practice. However, capturing such domain-specific information embedded in long-length unstructured texts is always time-consuming and sometimes difficult. Therefore, it is highly desirable for a computer system to automatically extract the main knowledge points and their corresponding inner structures from given documents. In this study of document understanding for design support (DocUDS), a design-perspective knowledge extraction approach is proposed that uses phrase-level domain-specific labeled datasets to finetune a Bidirectional Encoder Representation from Transformers (BERT) model so that it can extract design knowledge from documents. The BERT model finetuning attempts to blend in the domain-specific knowledge of well-recognized domain concepts and is based on the datasets generated from design reports. The model is utilized to map the captured sentences to the main design entities <requirement>, <function>, and <solution>. In addition, this approach uncovers inner relationships among the sentences and constructs overall structures of documents to enhance understanding. The definitions of design perspectives, inter-perspective relations, and intra-perspective relations are introduced, which together capture the main design knowledge points and their relations and constitute an understanding of the design domain knowledge of a text. The case study results have demonstrated the proposed approach's effectiveness in understanding and extracting relevant design knowledge points.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#755 - Quiroz-Mercado 2020
Semantic Similarity Estimation Using Vector Symbolic Architectures

Quiroz-Mercado, J. I.; Barrón-Fernández, R.; Ramírez-Salinas, M. A.

IEEE Access 2020;8():109120-109132

2020

DOI: 10.1109/access.2020.3001765 · Ref ID: 3460

For many natural language processing applications, estimating similarity and relatedness between words are key tasks that serve as the basis for classification and generalization. Currently, vector semantic models (VSM) have become a fundamental language modeling tool. VSMs represent words as points in a high-dimensional space and follow the distributional hypothesis of meaning, which assumes that semantic similarity is related to the context. In this paper, we propose a model whose representations are based on the semantic features associated with a concept within the ConceptNet knowledge graph. The proposed model is based on a vector symbolic architecture framework, which defines a set of arithmetic operations to encode the semantic features within a single high-dimensional vector. In addition to word distribution, these vector representations consider several types of information. Moreover, owing to the properties of high-dimensional spaces, they have the additional advantage of being interpretable. We analyze the model's performance on the SimLex-999 dataset, a dataset where commonly used distributional models (e.g., word2vec or GloVe) perform poorly. Our results are similar to those of other hybrid models, and they surpass several state-of-the-art distributional and knowledge-based models.
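The core vector-symbolic operations described above (binding role to filler, bundling several bound pairs into one vector, and comparing by cosine similarity) can be sketched in a few lines with bipolar hypervectors. This is a generic VSA illustration, not the authors' ConceptNet-based model:

```python
import random

DIM = 10_000                      # high dimensionality is what makes it work
rng = random.Random(0)

def hypervector():
    """Random bipolar (+1/-1) hypervector."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Elementwise product: binds a role to a filler (self-inverse)."""
    return [x * y for x, y in zip(a, b)]

def bundle(*vs):
    """Elementwise sum: superposes several bound pairs into one vector."""
    return [sum(xs) for xs in zip(*vs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)
```

Binding a bundled memory with a role vector approximately recovers that role's filler (cosine near 0.7 in this scheme), while unrelated fillers stay near 0, which is what makes the representations queryable and interpretable.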

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3431 - Rabby 2024
Fine-tuning and Prompt Engineering with Cognitive Knowledge Graphs for Scholarly Knowledge Organization

Rabby, Gollam; Auer, Sören; D'Souza, Jennifer; Oelen, Allard

arXiv 2024;():

2024

Ref ID: 8587

The increasing number of published scholarly articles, exceeding 2.5 million yearly, raises the challenge for researchers of following scientific progress. Integrating the contributions from scholarly articles into a novel type of cognitive knowledge graph (CKG) will be a crucial element for accessing and organizing scholarly knowledge, surpassing the insights provided by titles and abstracts. This research focuses on effectively conveying structured scholarly knowledge by utilizing large language models (LLMs) to categorize scholarly articles and describe their contributions in a structured and comparable manner. While previous studies explored language models within specific research domains, the extensive domain-independent knowledge captured by LLMs offers a substantial opportunity for generating structured contribution descriptions as CKGs. Additionally, LLMs offer customizable pathways through prompt engineering or fine-tuning, thus facilitating the use of smaller LLMs known for their efficiency, cost-effectiveness, and environmental considerations. Our methodology involves harnessing LLM knowledge and complementing it with domain expert-verified scholarly data sourced from a CKG. This strategic fusion significantly enhances LLM performance, especially in tasks like scholarly article categorization and predicate recommendation. Our method involves fine-tuning LLMs with CKG knowledge and additionally injecting knowledge from a CKG with a novel prompting technique, significantly increasing the accuracy of scholarly knowledge extraction. We integrated our approach into the Open Research Knowledge Graph (ORKG), thus enabling precise access to organized scholarly knowledge, crucially benefiting domain-independent scholarly knowledge exchange and dissemination among policymakers, industrial practitioners, and the general public.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3274 - Radha 2024
Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners

Radha, Santosh Kumar; Goktas, Oktay

arXiv 2024;():

2024

Ref ID: 8685

Human learning thrives on the ability to learn from mistakes, adapt through feedback, and refine understanding-processes often missing in static machine learning models. In this work, we introduce Composite Learning Units (CLUs) designed to transform reasoners, such as Large Language Models (LLMs), into learners capable of generalized, continuous learning without conventional parameter updates while enhancing their reasoning abilities through continual interaction and feedback. CLUs are built on an architecture that allows a reasoning model to maintain and evolve a dynamic knowledge repository: a General Knowledge Space for broad, reusable insights and a Prompt-Specific Knowledge Space for task-specific learning. Through goal-driven interactions, CLUs iteratively refine these knowledge spaces, enabling the system to adapt dynamically to complex tasks, extract nuanced insights, and build upon past experiences autonomously. We demonstrate CLUs' effectiveness through a cryptographic reasoning task, where they continuously evolve their understanding through feedback to uncover hidden transformation rules. While conventional models struggle to grasp underlying logic, CLUs excel by engaging in an iterative, goal-oriented process. Specialized components-handling knowledge retrieval, prompt generation, and feedback analysis-work together within a reinforcing feedback loop. This approach allows CLUs to retain the memory of past failures and successes, adapt autonomously, and apply sophisticated reasoning effectively, continually learning from mistakes while also building on breakthroughs.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#1026 - Rajpal 2023
BERTologyNavigator: Advanced Question Answering with BERT-based Semantics

Rajpal, S.; Usbeck, R.

CEUR Workshop Proceedings 2023;3592():

CEUR-WS 2023

Ref ID: 5094

The development and integration of knowledge graphs and language models has significance in artificial intelligence and natural language processing. In this study, we introduce the BERTologyNavigator, a two-phase system that combines relation extraction techniques and BERT embeddings to navigate the relationships within the DBLP Knowledge Graph (KG). Our approach focuses on extracting one-hop relations and labelled candidate pairs in the first phase. This is followed by employing BERT's CLS embeddings and additional heuristics for relation selection in the second phase. Our system reaches an F1 score of 0.2175 on the DBLP QuAD Final test dataset for Scholarly QALD and an F1 score of 0.98 on the subset of the DBLP QuAD test dataset during the QA phase. © 2023 CEUR-WS. All rights reserved.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#42 - Ramchand 2024
Augmenting Infrequent Relationships in Clinical Language Models with Graph-Encoded Hierarchical Ontologies

Ramchand, S.; Xie, X. H.

1st International Conference on Artificial Intelligence in Healthcare (AIiH) 2024;14975():31-44

Swansea, ENGLAND Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-67278-1_3 · Ref ID: 3195

Harnessing primary-care data can facilitate earlier clinical interventions via predictive modelling. Nonetheless, the intricacy of medical terminology and the breadth of ontological data often obscure the inner workings of such models. Despite the growing complexity of artificial intelligence methodologies and the pressing demand for medical tools that seamlessly integrate into clinical workflows, this opacity persists. We propose enhancing clinical Bidirectional Encoder Representations from Transformers (BERT) models with graph attention networks that encode diagnosis and medication concept hierarchies derived from primary care data. In 10-fold cross-validation on cardiovascular and respiratory detection tasks, our graph-enhanced model marginally improves F1 performance over baseline BERT. More importantly, our approach surfaces clinically deterministic patterns in patient groups, provides modular visualisations of influential terminal and ancestral medical concepts, and improves clustering of related conditions. Additionally, the hierarchical encoding allows quantitative analysis of edge relevance within and across diagnosis and medical ontologies. Our research shows that injecting structured knowledge graphs into language model architectures can improve performance through domain-specific regularisation. Additionally, the use of class activation maps throughout the approach allows for richer interpretations of predictions by following activation flows along concept relationships. The dual utility of precise ontology encoding and Large Language Models makes our graph-injected clinical language model more accurate and trustworthy, propelling preventive precision medicine forward.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3878 - Rangel 2024
SPARQL Generation: an analysis on fine-tuning OpenLLaMA for Question Answering over a Life Science Knowledge Graph

Rangel, Julio C.; de Farias, Tarcisio Mendes; Sima, Ana Claudia; Kobayashi, Norio

arXiv 2024;():

2024

Ref ID: 8079

The recent success of Large Language Models (LLM) in a wide range of Natural Language Processing applications opens the path towards novel Question Answering Systems over Knowledge Graphs leveraging LLMs. However, one of the main obstacles preventing their implementation is the scarcity of training data for the task of translating questions into corresponding SPARQL queries, particularly in the case of domain-specific KGs. To overcome this challenge, in this study, we evaluate several strategies for fine-tuning the OpenLlama LLM for question answering over life science knowledge graphs. In particular, we propose an end-to-end data augmentation approach for extending a set of existing queries over a given knowledge graph towards a larger dataset of semantically enriched question-to-SPARQL query pairs, enabling fine-tuning even for datasets where these pairs are scarce. In this context, we also investigate the role of semantic "clues" in the queries, such as meaningful variable names and inline comments. Finally, we evaluate our approach over the real-world Bgee gene expression knowledge graph and we show that semantic clues can improve model performance by up to 33% compared to a baseline with random variable names and no comments included.
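The "semantic clues" idea can be illustrated with a naive string rewrite that renames opaque SPARQL variables and prepends an inline comment. This is a toy sketch; a real implementation would parse the query rather than rely on substring replacement (which could clash on overlapping variable names):

```python
def add_semantic_clues(query: str, renames: dict, comment: str) -> str:
    """Rename SPARQL variables to meaningful names and prepend a comment,
    giving the fine-tuned model semantic hints about the query's intent."""
    for old, new in renames.items():
        query = query.replace(f"?{old}", f"?{new}")
    return f"# {comment}\n{query}"
```

Training on query text enriched this way is what the abstract credits with up to a 33% improvement over random variable names without comments.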

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#52 - Rawsthorne 2023
Automatic Nested Spatial Entity and Spatial Relation Extraction From Text for Knowledge Graph Creation: A Baseline Approach and a Benchmark Dataset

Rawsthorne, H. M.; Abadie, N.; Kergosien, E.; Duchêne, C.; Saux, É

7th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities) 2023;():21-30

Hamburg, GERMANY Assoc Computing Machinery 2023

DOI: 10.1145/3615887.3627754 · Ref ID: 3264

Automatically extracting geographic information from text is the key to harnessing the vast amount of spatial knowledge that only exists in this unstructured form. The fundamental elements of spatial knowledge include spatial entities, their types and the spatial relations between them. Structuring the spatial knowledge contained within text as a geospatial knowledge graph, and disambiguating the spatial entities, significantly facilitates its reuse. The automatic extraction of geographic information from text also allows the creation or enrichment of gazetteers. We propose a baseline approach for nested spatial entity and binary spatial relation extraction from text, a new annotated French-language benchmark dataset on the maritime domain that can be used to train algorithms for both extraction tasks, and benchmark results for the two tasks carried out individually and end-to-end. Our approach involves applying the Princeton University Relation Extraction system (PURE), made for flat, generic entity extraction and generic binary relation extraction, to the extraction of nested, spatial entities and spatial binary relations. By extracting nested spatial entities and the spatial relations between them, we have more information to aid entity disambiguation. In our experiments we compare the performance of a pretrained monolingual French BERT language model with that of a pretrained multilingual BERT language model, and study the effect of including cross-sentence context. Our results reveal very similar results for both models, although the multilingual model performs slightly better in entity extraction, and the monolingual model has slightly better relation extraction and end-to-end performances. We observe that increasing the amount of cross-sentence context improves the results for entity extraction whereas it has the opposite effect on relation extraction.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1997 - Rawte 2024
Tutorial Proposal: Hallucination in Large Language Models

Rawte, V.; Chadha, A.; Sheth, A.; Das, A.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Tutorial Summaries 2024;():68-72

European Language Resources Association (ELRA) 2024

Ref ID: 4651

In the fast-paced domain of Large Language Models (LLMs), the issue of hallucination is a prominent challenge. Despite continuous endeavors to address this concern, it remains a highly active area of research within the LLM landscape. Grasping the intricacies of this problem can be daunting, especially for those new to the field. This tutorial aims to bridge this knowledge gap by introducing the emerging realm of hallucination in LLMs. It will comprehensively explore the key aspects of hallucination, including benchmarking, detection, and mitigation techniques. Furthermore, we will delve into the specific constraints and shortcomings of current approaches, providing valuable insights to guide future research efforts for participants. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#354 - Razouk 2023
Improving FMEA Comprehensibility via Common-Sense Knowledge Graph Completion Techniques

Razouk, H.; Liu, X. L.; Kern, R.

IEEE Access 2023;11():127974-127986

2023

DOI: 10.1109/access.2023.3331585 · Ref ID: 3083

The Failure Mode Effect Analysis process (FMEA) is widely used in industry for risk assessment, as it effectively captures and documents domain-specific knowledge. This process is mainly concerned with causal domain knowledge. In practical applications, FMEAs encounter challenges in terms of comprehensibility, particularly related to inadequate coverage of listed failure modes and their corresponding effects and causes. This can be attributed to the limitations of traditional brainstorming approaches typically employed in the FMEA process. Depending on the size and diversity in terms of disciplines of the team conducting the analysis, these approaches may not adequately capture a comprehensive range of failure modes, leading to gaps in coverage. To this end, methods for improving FMEA knowledge comprehensibility are highly needed. A potential approach to address this gap is rooted in recent advances in common-sense knowledge graph completion, which have demonstrated the effectiveness of text-aware graph embedding techniques. However, the applicability of such methods in an industrial setting is limited. This paper addresses this issue on FMEA documents in an industrial environment. Here, the application of common-sense knowledge graph completion methods on FMEA documents from semiconductor manufacturing is studied. These methods achieve over 20% MRR on the test set and 70% of the top 10 predictions were manually assessed to be plausible by domain experts. Based on the evaluation, this paper confirms that text-aware knowledge graph embeddings for common-sense knowledge graph completion are more effective than structure-only knowledge graph embeddings for improving FMEA knowledge comprehensibility. Additionally, we found that in-domain fine-tuning of the language model is beneficial for extracting more meaningful embeddings, thus improving the overall model performance.
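For readers unfamiliar with the MRR figure cited above: mean reciprocal rank is the standard link-prediction metric for knowledge graph completion. A minimal sketch, with illustrative ranks rather than the paper's data:

```python
# Mean reciprocal rank (MRR): for each test triple, take the 1-based rank
# of the correct entity among the model's ranked candidates, then average
# the reciprocals of those ranks over the test set.
def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the true entity for each test triple."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy example: the true entity is ranked 1st, 4th and 10th
# in three predictions (illustrative numbers, not the paper's).
print(round(mean_reciprocal_rank([1, 4, 10]), 2))  # 0.45
```

An MRR above 0.20, as reported here, means the correct completion sits near the top of the candidate list on average.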

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1347 - Regino 2024
Generating E-commerce Related Knowledge Graph from Text: Open Challenges and Early Results using LLMs

Regino, A. G.; Cesar dos Reis, J.

CEUR Workshop Proceedings 2024;3747():18

CEUR-WS 2024

Ref ID: 4360

E-commerce systems need to use and manage vast amounts of unstructured textual data. This poses significant challenges for knowledge representation, information retrieval, and recommendation tasks. This study investigates the generation of E-commerce-related Knowledge Graphs (KGs) from text. In particular, we explore using Large Language Models (LLMs). Our approach integrates ontology with text-based examples from existing KGs via prompts to create structured RDF triples. We outline a four-step method encompassing text classification, extracting relevant characteristics, generating RDF triples, and assessing the generated triples. Each step leverages LLM instructions to process unstructured text. We discuss the insights, challenges, and potential future directions, highlighting the significance of integrating ontology elements with unstructured text for generating semantically enriched KGs. Through case experiments, we demonstrate the effectiveness and applicability of our solution in the E-commerce domain. © 2024 Copyright for this paper by its authors.
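A hedged sketch of the triple-generation step described in the abstract: the LLM is prompted with ontology terms and examples, and its text output is parsed into triples, keeping only those that use known ontology properties. The delimiter format, property names, and example output below are assumptions for illustration, not the authors' actual prompt design; the LLM call itself is omitted.

```python
# Toy parser for LLM output of the form "subject | predicate | object",
# filtering against a small ontology vocabulary (names are invented).
ONTOLOGY_PROPERTIES = {"schema:brand", "schema:color", "schema:material"}

def parse_triples(llm_output):
    triples = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Keep well-formed lines whose predicate is a known ontology term.
        if len(parts) == 3 and parts[1] in ONTOLOGY_PROPERTIES:
            triples.append(tuple(parts))
    return triples

fake_llm_output = "ex:shirt42 | schema:brand | ex:Acme\nthis line is noise"
print(parse_triples(fake_llm_output))
# [('ex:shirt42', 'schema:brand', 'ex:Acme')]
```

Validating generated triples against the ontology vocabulary corresponds to the "assessing the generated triples" step the abstract lists.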

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2800 - Rehmat 2020
Predicting the pathogenicity of protein coding mutations using Natural Language Processing

Rehmat, N.; Farooq, H.; Kumar, S.; Hussain, S. ul; Naveed, H.

2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 2020;():5842-5846

2020

DOI: 10.1109/EMBC44109.2020.9175781 · Ref ID: 6681

DNA-Sequencing of tumor cells has revealed thousands of genetic mutations. However, cancer is caused by only some of them. Identifying mutations that contribute to tumor growth from neutral ones is extremely challenging and is currently carried out manually. This manual annotation is very cumbersome and expensive in terms of time and money. In this study, we introduce a novel method "NLP-SNPPred" to read scientific literature and learn the implicit features that cause certain variations to be pathogenic. Precisely, our method ingests the bio-medical literature and produces its vector representation by exploiting state-of-the-art NLP methods like sent2vec, word2vec and tf-idf. These representations are then fed to machine learning predictors to identify the pathogenic versus neutral variations. Our best model (NLP-SNPPred), trained on OncoKB and evaluated on several publicly available benchmark datasets, outperformed state-of-the-art function prediction methods. Our results show that NLP can be used effectively in predicting the functional impact of protein coding variations with minimal complementary biological features. Moreover, encoding biological knowledge into the right representations, combined with machine learning methods, can help in automating manual efforts. A free to use web-server is available at http://www.nlp-snppred.cbrlab.org.
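A minimal, pure-Python sketch of the tf-idf representation step the abstract mentions: literature snippets become sparse weighted vectors that a downstream classifier can consume. The snippets are invented; the paper builds such representations (alongside word2vec and sent2vec) from real bio-medical literature with OncoKB labels.

```python
import math
from collections import Counter

# Two toy "literature snippets" (invented, for illustration only).
docs = [
    "mutation disrupts kinase activity and drives tumor growth",
    "benign variant with no effect on tumor growth",
]

def tfidf(corpus):
    """Return one {term: weight} vector per document."""
    n = len(corpus)
    tokenized = [doc.split() for doc in corpus]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (tf[t] / len(toks)) * math.log((1 + n) / (1 + df[t]))
                        for t in tf})
    return vectors

vecs = tfidf(docs)
# Terms unique to one snippet ("kinase") get positive weight; terms shared
# by every snippet ("tumor") get zero under this simple idf variant.
print(vecs[0]["kinase"] > 0, vecs[0]["tumor"] == 0.0)  # True True
```

In the paper's pipeline, vectors like these feed standard machine-learning predictors that separate pathogenic from neutral variations.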

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1391 - Ren 2024
Identifying Semantic Induction Heads to Understand In-Context Learning

Ren, J.; Guo, Q.; Yan, H.; Liu, D.; Zhang, Q.; Qiu, X.; Lin, D.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():6916-6932

Association for Computational Linguistics (ACL) 2024

Ref ID: 4530

Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs. © 2024 Association for Computational Linguistics.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#2121 - Ren 2020
API-Misuse Detection Driven by Fine-Grained API-Constraint Knowledge Graph

Ren, X.; Ye, X.; Xing, Z.; Xia, X.; Xu, X.; Zhu, L.; Sun, J.

2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2020;():461-472

2020

Ref ID: 6115

API misuses cause significant problems in software development. Existing methods detect API misuses against frequent API usage patterns mined from a codebase. They make a naive assumption that API usage that deviates from the most-frequent API usage is a misuse. However, there is a big knowledge gap between API usage patterns and API usage caveats in terms of comprehensiveness, explainability and best practices. In this work, we propose a novel approach that detects API misuses directly against the API caveat knowledge, rather than API usage patterns. We develop open information extraction methods to construct a novel API-constraint knowledge graph from API reference documentation. This knowledge graph explicitly models two types of API-constraint relations (call-order and condition-checking) and enriches return and throw relations with return conditions and exception triggers. It empowers the detection of three types of frequent API misuses - missing calls, missing condition checking and missing exception handling - while existing detectors mostly focus on only missing calls. As a proof-of-concept, we apply our approach to the Java SDK API Specification. Our evaluation confirms the high accuracy of the extracted API-constraint relations. Our knowledge-driven API misuse detector achieves 0.60 (68/113) precision and 0.28 (68/239) recall for detecting Java API misuses in the API misuse benchmark MuBench. This performance is significantly higher than that of existing pattern-based API misuse detectors. A pilot user study with 12 developers shows that our knowledge-driven API misuse detection is very promising in helping developers avoid API misuses and debug the bugs caused by API misuses.
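The precision and recall figures above follow directly from the counts given in parentheses: 68 correct detections out of 113 reported, against 239 misuses in the MuBench benchmark.

```python
# Reproducing the reported detection metrics from the abstract's counts:
# 68 true positives, 113 detections reported, 239 benchmark misuses.
tp, reported, benchmark_total = 68, 113, 239
precision = tp / reported         # fraction of reports that are real misuses
recall = tp / benchmark_total     # fraction of benchmark misuses found
print(round(precision, 2), round(recall, 2))  # 0.6 0.28
```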

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3558 - Ren 2023
Joint Semantic and Structural Representation Learning for Enhancing User Preference Modelling

Ren, Xuhui; Yuan, Wei; Chen, Tong; Yang, Chaoqun; Nguyen, Quoc Viet Hung; Yin, Hongzhi

arXiv 2023;():

2023

Ref ID: 7682

Knowledge graphs (KGs) have become important auxiliary information for helping recommender systems obtain a good understanding of user preferences. Despite recent advances in KG-based recommender systems, existing methods are prone to suboptimal performance due to the following two drawbacks: 1) current KG-based methods over-emphasize the heterogeneous structural information within a KG and overlook the underlying semantics of its connections, hindering the recommender from distilling the explicit user preferences; and 2) the inherent incompleteness of a KG (i.e., missing facts, relations and entities) will deteriorate the information extracted from KG and weaken the representation learning of recommender systems. To tackle the aforementioned problems, we investigate the potential of jointly incorporating the structural and semantic information within a KG to model user preferences in finer granularity. A new framework for KG-based recommender systems, namely Knowledge Infomax Recommender System with Contrastive Learning (KIRS-CL), is proposed in this paper. Distinct from previous KG-based approaches, KIRS-CL utilizes structural and connectivity information with high-quality item embeddings learned by encoding KG triples with a pre-trained language model. These well-trained entity representations enable KIRS-CL to find the item to recommend via the preference connection between the user and the item. Additionally, to improve the generalizability of our framework, we introduce a contrastive warm-up learning strategy, making it capable of dealing with both warm- and cold-start recommendation scenarios. Extensive experiments on two real-world datasets demonstrate remarkable improvements over state-of-the-art baselines.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#272 - Ren 2021
Fake News Detection on News-Oriented Heterogeneous Information Networks through Hierarchical Graph Attention

Ren, Y. X.; Zhang, J. W.; Ieee

International Joint Conference on Neural Networks (IJCNN) 2021;():

Electr Network Ieee 2021

DOI: 10.1109/ijcnn52387.2021.9534362 · Ref ID: 3690

The viral spread of fake news has caused great social harm, making fake news detection an urgent task. Current fake news detection methods rely heavily on text information by learning the extracted news content or writing style of internal knowledge. However, deliberate rumors can mask writing style, bypassing language models and invalidating simple text-based models. In fact, news articles and other related components (such as news creators and news topics) can be modeled as a heterogeneous information network (HIN for short). In this paper, we propose a novel fake news detection framework, namely Hierarchical Graph Attention Network (HGAT), which uses a novel hierarchical attention mechanism to perform node representation learning in the HIN, and then detects fake news by classifying news article nodes. Experiments on two real-world fake news datasets show that HGAT can outperform text-based models and other network-based models. In addition, the experiments prove the expandability and generalizability of our approach for graph representation learning and other node classification related applications in heterogeneous graphs.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1969 - Ren 2023
Towards Informative Open-ended Text Generation with Dynamic Knowledge Triples

Ren, Z.; Zhao, Y.; Zong, C.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():3189-3203

Association for Computational Linguistics (ACL) 2023

Ref ID: 5069

Pretrained language models (PLMs), especially large language models (LLMs), demonstrate impressive capabilities in open-ended text generation. However, our statistical results show that LLMs often suffer from over-concentrated information, where the generated texts overly focus on the given prompt and fail to provide sufficient background and detailed information as humans do. To address this issue, we propose a dynamic knowledge-guided informative open-ended text generation approach that utilizes a knowledge graph to help the model generate more contextually related entities and detailed facts. Specifically, we first employ a local knowledge filter to extract relevant knowledge from the comprehensive knowledge graph for a given topic sentence. Then we introduce a dynamic knowledge selector to predict the entity to be mentioned in the subsequent sentence. Finally, we utilize a knowledge-enhanced text generator to produce a more informative output. To evaluate the effectiveness of our approach, we consider two scenarios: fine-tuning for small PLMs and prompt tuning for LLMs. Experimental results show that our approach could generate more informative texts than baselines. © 2023 Association for Computational Linguistics.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1275 - Riaz 2023
Entity Typing with Triples Using Language Models

Riaz, A.; Abdollahi, S.; Gottschalk, S.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13998 LNCS():169-173

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-43458-7_32 · Ref ID: 5143

Entity Typing is the task of assigning a type to an entity in a knowledge graph. In this paper, we propose ETwT (Entity Typing with Triples), which leverages the triples of an entity, namely its label, description and the property labels used on it. We analyse which language models and classifiers are best suited to this input and compare ETwT's performance on coarse-grained and fine-grained entity typing. Our evaluation demonstrates that ETwT is able to predict coarse-grained entity types with an F1 score of 0.994, outperforming three baselines. © The Author(s), under exclusive license to Springer Nature Switzerland AG. 2023.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#2027 - Ringwald 2024
Well-Written Knowledge Graphs: Most Effective RDF Syntaxes for Triple Linearization in End-to-End Extraction of Relations from Texts

Ringwald, C.; Gandon, F.; Faron, C.; Michel, F.; Akl, H. A.

Proceedings of the AAAI Conference on Artificial Intelligence 2024;38():23631-23632

Association for the Advancement of Artificial Intelligence 2024

DOI: 10.1609/aaai.v38i21.30502 · Ref ID: 4049

Seq-to-seq generative models recently gained attention for solving the relation extraction task. By approaching this problem as an end-to-end task, they surpassed encoder-based-only models. Little research investigated the effects of the output syntaxes on the training process of these models. Moreover, a limited number of approaches were proposed for extracting ready-to-load knowledge graphs following the RDF standard. In this paper, we consider that a set of triples can be linearized in many different ways, and we evaluate the combined effect of the size of the language models and different RDF syntaxes on the task of relation extraction from Wikipedia abstracts. Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
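To make the "different RDF syntaxes" concrete: the same triple can be linearized, for example, as N-Triples or as Turtle, and a seq-to-seq model trained for end-to-end extraction must emit whichever string format was chosen as the target. The triple below is an invented example, not from the paper's Wikipedia data.

```python
# One fact serialized in two RDF syntaxes. Since the model learns to
# generate these strings token by token, the syntax choice changes the
# target sequence and thus can affect training.
triple = ("<http://example.org/Marie_Curie>",
          "<http://example.org/birthPlace>",
          "<http://example.org/Warsaw>")

# N-Triples: one fully qualified triple per line, no abbreviations.
ntriples = " ".join(triple) + " ."

# Turtle: a prefix declaration shortens IRIs in every following triple.
turtle = ("@prefix ex: <http://example.org/> .\n"
          "ex:Marie_Curie ex:birthPlace ex:Warsaw .")

print(ntriples)
```

Both strings are "ready-to-load" by standard RDF parsers, which is the practical benefit of targeting a standard syntax rather than an ad hoc triple format.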

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#881 - Ringwald 2024
Well-Written Knowledge Graphs: Most Effective RDF Syntaxes for Triple Linearization in End-to-End Extraction of Relations from Texts (Student Abstract)

Ringwald, C.; Gandon, F.; Faron, C.; Michel, F.; Akl, H. A.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():23631-23632

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3130

Seq-to-seq generative models recently gained attention for solving the relation extraction task. By approaching this problem as an end-to-end task, they surpassed encoder-based-only models. Little research investigated the effects of the output syntaxes on the training process of these models. Moreover, a limited number of approaches were proposed for extracting ready-to-load knowledge graphs following the RDF standard. In this paper, we consider that a set of triples can be linearized in many different ways, and we evaluate the combined effect of the size of the language models and different RDF syntaxes on the task of relation extraction from Wikipedia abstracts.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#703 - Ristoski 2016
RDF2Vec: RDF Graph Embeddings for Data Mining

Ristoski, P.; Paulheim, H.

15th International Semantic Web Conference (ISWC) 2016;9981():498-514

Kobe, JAPAN Springer International Publishing Ag 2016

DOI: 10.1007/978-3-319-46523-4_30 · Ref ID: 3408

Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs. We generate sequences by leveraging local information from graph substructures, harvested by Weisfeiler-Lehman Subtree RDF Graph Kernels and graph walks, and learn latent numerical representations of entities in RDF graphs. Our evaluation shows that such vector representations outperform existing techniques for the propositionalization of RDF graphs on a variety of different predictive machine learning tasks, and that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be easily reused for different tasks.
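The sequence-generation step of RDF2Vec can be sketched in a few lines: random walks over the graph yield "sentences" of alternating entities and properties, which a word2vec-style model then embeds (the training step is omitted here). The toy graph and entity names below are invented for illustration.

```python
import random

# Toy RDF graph: entity -> list of (property, object) edges.
graph = {
    "dbr:Berlin": [("dbo:country", "dbr:Germany")],
    "dbr:Germany": [("dbo:capital", "dbr:Berlin"),
                    ("dbo:currency", "dbr:Euro")],
    "dbr:Euro": [],
}

def random_walk(start, depth, rng):
    """One walk of up to `depth` hops; properties appear between entities."""
    walk, node = [start], start
    for _ in range(depth):
        edges = graph.get(node, [])
        if not edges:
            break
        prop, node = rng.choice(edges)
        walk += [prop, node]
    return walk

rng = random.Random(0)
walks = [random_walk("dbr:Berlin", 2, rng) for _ in range(3)]
# Each walk is a "sentence" fed to a word2vec-style model, e.g.
# ['dbr:Berlin', 'dbo:country', 'dbr:Germany', ...]
print(walks[0][:3])
```

Treating walks as sentences is what lets unsupervised language-modeling machinery produce the latent entity vectors the abstract describes.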

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3621 - Robert 2021
Language Models as a Knowledge Source for Cognitive Agents

Wray, Robert E., III; Kirk, James R.; Laird, John E.

arXiv 2021;():

2021

Ref ID: 7482

Language models (LMs) are sentence-completion engines trained on massive corpora. LMs have emerged as a significant breakthrough in natural-language processing, providing capabilities that go far beyond sentence completion including question answering, summarization, and natural-language inference. While many of these capabilities have potential application to cognitive systems, exploiting language models as a source of task knowledge, especially for task learning, offers significant, near-term benefits. We introduce language models and the various tasks to which they have been applied and then review methods of knowledge extraction from language models. The resulting analysis outlines both the challenges and opportunities for using language models as a new knowledge source for cognitive systems. It also identifies possible ways to improve knowledge extraction from language models using the capabilities provided by cognitive systems. Central to success will be the ability of a cognitive agent to itself learn an abstract model of the knowledge implicit in the LM as well as methods to extract high-quality knowledge effectively and efficiently. To illustrate, we introduce a hypothetical robot agent and describe how language models could extend its task knowledge and improve its performance and the kinds of knowledge and methods the agent can use to exploit the knowledge within a language model.

Srividya voted
Mike voted
Final decision
What was the agreed final decision?

#1441 - Rockstroh 2023
A is the B of C: (Semi)-Automatic Creation of Vossian Antonomasias

Rockstroh, J.; D'Ippolito, G.; Lazzari, N.; Oudshoorn, A. M.; Purohit, D.; Raoufi, E.; Rudolph, S.

CEUR Workshop Proceedings 2023;3640():

CEUR-WS 2023

Ref ID: 4968

A Vossian Antonomasia (VA) is a stylistic device used to describe a person (or, more generally, an entity) in terms of a well-known person and a modifying context. For instance, the Norwegian chess world champion Magnus Carlsen was described as "the Mozart of chess". All VAs follow the pattern where a source (e.g., "Mozart") is used to describe a target (e.g., "Magnus Carlsen"), and the transfer of meaning is "channeled" through the use of the modifier "of chess". Although this rhetorical figure is well-known, there has not yet been a dedicated study of targeted automatic or semi-automatic methods to generate and judge the appropriateness of VAs using large Knowledge Graphs (KGs) such as Wikidata. In our work, we propose the use of vector space embeddings - both KG-based and text-based - for producing VAs. For comparison, we contrast our findings with a purely LLM-based approach, wherein VAs are obtained from ChatGPT using a reasonably engineered prompt. We provide a publicly available GitHub repository for the implementation of our method and a website that allows testing the proposed methods. © 2023 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2457 - Romeikat 2011
Formal Specification of Domain-Specific ECA Policy Models

Romeikat, R.; Bauer, B.

2011 Fifth International Conference on Theoretical Aspects of Software Engineering 2011;():209-212

2011

DOI: 10.1109/TASE.2011.29 · Ref ID: 6234

Policy-based management allows systems to be adapted to changed requirements in a flexible and automated way. Policy development usually starts with the specification of high-level policies, which are then refined into a low-level representation. We use models to specify event-condition-action (ECA) policies at different levels of abstraction and consequently separate domain and policy aspects from each other. Domain-specific concepts are used within policies in their event, condition, and action parts. We present a formal specification of the models by means of a relational algebra. The algebra is used to validate the models at each level. Finally, executable policy code is generated from the low-level models.
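A minimal illustration of the event-condition-action model the abstract builds on: when an event arrives, the action fires only if the condition holds. The scaling example and all names are invented, not taken from the paper's policy models.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class EcaPolicy:
    """One event-condition-action rule."""
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], str]

    def handle(self, event: str, ctx: dict) -> Optional[str]:
        # Fire the action iff the event matches and the condition holds.
        if event == self.event and self.condition(ctx):
            return self.action(ctx)
        return None

policy = EcaPolicy(
    event="load_spike",
    condition=lambda ctx: ctx["cpu"] > 0.9,
    action=lambda ctx: "scale_out",
)
print(policy.handle("load_spike", {"cpu": 0.95}))  # scale_out
```

The paper's contribution is formalizing such rules with a relational algebra so that policy models can be validated at each abstraction level before code generation.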

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#847 - Rony 2022
Tree-KGQA: An Unsupervised Approach for Question Answering Over Knowledge Graphs

Rony, Mdra; Chaudhuri, D.; Usbeck, R.; Lehmann, J.

IEEE Access 2022;10():50467-50478

2022

DOI: 10.1109/access.2022.3173355 · Ref ID: 3121

Most Knowledge Graph-based Question Answering (KGQA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, in this paper, we propose Tree-KGQA, an unsupervised KGQA system leveraging pre-trained language models and tree-based algorithms. Entity and relation linking are essential components of any KGQA system. We employ several pre-trained language models in the entity linking task to recognize the entities mentioned in the question and obtain the contextual representation for indexing. Furthermore, for relation linking we incorporate a pre-trained language model previously trained for language inference task. Finally, we introduce a novel algorithm for extracting the answer entities from a KG, where we construct a forest of interpretations and introduce tree-walking and tree disambiguation techniques. Our algorithm uses the linked relation and predicts the tree branches that eventually lead to the potential answer entities. The proposed method achieves 4.5% and 7.1% gains in F1 score in entity linking tasks on LC-QuAD 2.0 and LC-QuAD 2.0 (KBpearl) datasets, respectively, and a 5.4% increase in the relation linking task on LC-QuAD 2.0 (KBpearl). The comprehensive evaluations demonstrate that our unsupervised KGQA approach outperforms other supervised state-of-the-art methods on the WebQSP-WD test set (1.4% increase in F1 score) - without training on the target dataset.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1784 - Rosati 2016
RDF graph embeddings for content-based recommender systems

Rosati, J.; Ristoski, P.; Di Noia, T.; De Leone, R.; Paulheim, H.

CEUR Workshop Proceedings 2016;1673():23-30

CEUR-WS 2016

Ref ID: 5809

Linked Open Data has been recognized as a useful source of background knowledge for building content-based recommender systems. Vast amounts of RDF data, covering multiple domains, have been published in freely accessible datasets. In this paper, we present an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs used for building content-based recommender systems. We generate sequences by leveraging local information from graph sub-structures and learn latent numerical representations of entities in RDF graphs. Our evaluation on two datasets in the domain of movies and books shows that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be effectively used in content-based recommender systems.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2975 - Roslovtsev 2013
A synthetic approach to building a canonical model of subject areas in the integration bus

Roslovtsev, V.; Shumsky, L.; Evgeny, B.; Anastasya, B.; Kazantsev, N.

2013 3rd International Symposium ISKO-Maghreb 2013;():1-7

2013

DOI: 10.1109/ISKO-Maghreb.2013.6728118 · Ref ID: 6449

This paper is dedicated to the implementation considerations of a canonical model of subject areas in the integration bus and to the definition of data mapping corresponding to this model. The proposed approach to transforming data, when transferring them between individual applications or services of the system, is to convert input messages into output messages using an intermediate canonical representation of the data via rules that map the source model to the canonical one, and the canonical model to the target. Since the canonical model of a subject area serves an important technical task and is not intended for human use, it may be generated automatically, as a 'union' of the models used in the various application parts of the system, instead of being designed manually 'from scratch'. Subject area models of the application parts being integrated may be written using different formalisms, and yet another formalism may be used for the canonical model, so that a mechanism is required to automatically capture various concepts expressed in various formal systems. In the present paper we focus on developing such a mechanism, based on the automatic generation of a (somewhat simplified) representation of the most important kinds of entities in the canonical model.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2482 - Rožanc 2013
Framework for web application domain knowledge extraction

Rožanc, I.

2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) 2013;():705-710

2013

Ref ID: 6182

A decade ago the web application e-Student was built with the aim of providing electronic support for student enrolment and examination/alumni records management at the University of Ljubljana. Due to issues emerging from the Bologna reform, a new e-Student is to be built using modern technology in the near future. The old e-Student encapsulates a huge amount of domain knowledge. Unfortunately, it was developed using an agile approach resulting in poor technical documentation, thus an alternative approach for domain knowledge extraction has to be defined. In this paper a framework for effective web application domain knowledge extraction is defined. It has five elements. The main principles (1) of extraction are defined to perform effective reengineering of different application views at a defined abstract level. A proper knowledge representation using diverse models (2) has to be determined next, and Model Driven Architecture using UML models is considered a suitable choice. The procedure (3) for extraction has to be defined using appropriate (usually custom-made) tools (4) and performed by skilled staff (5), possibly members of the old development team. The use of the framework is demonstrated on the web application e-Student, outlining several custom-made tools, the results and the most valuable lessons learnt.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2198 - Rybiński 2022
Beyond Low-Code Development: Marrying Requirements Models and Knowledge Representations

Rybiński, K.; Śmiałek, M.

2022 17th Conference on Computer Science and Intelligence Systems (FedCSIS) 2022;():919-928

2022

DOI: 10.15439/2022F129 · Ref ID: 6276

Typical Low-Code Development platforms enable model-driven generation of web applications from high-level visual notations. They normally express the UI and the application logic, which allows generating the frontend and basic CRUD operations. However, more complex domain logic (data processing) operations still necessitate the use of traditional programming. This paper presents a visual language, called RSL-DL, to represent domain knowledge with complex domain rules aligned with requirements models. The language synthesises and extends approaches found in knowledge representation (ontologies) and software modelling language engineering. Its purpose is to enable a fully automatic generation of domain logic code by reasoning over and reusing domain knowledge. The language’s abstract syntax is defined using a meta-model expressed in MOF. Its semantics is expressed with several translational rules that map RSL-DL models onto typical programming language constructs. The rules are explained informally in natural language and formalised using a graphical transformation notation. It is also supported by introducing an inference engine that enables processing queries to domain models and selecting appropriate invocations to generated code. The presented language was implemented by building a dedicated model editor and transformation engine. It was also initially validated through usability studies. Based on these results, we conclude that declarative knowledge representations can be successfully used to produce imperative back-end code with non-trivial logic.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3287 - Saberi 2024
Context-Augmented Code Generation Using Programming Knowledge Graphs

Saberi, Iman; Fard, Fatemeh

arXiv 2024;():

2024

Ref ID: 8747

Large Language Models (LLMs) and Code-LLMs (CLLMs) have significantly improved code generation, but they frequently face difficulties when dealing with challenging and complex problems. Retrieval-Augmented Generation (RAG) addresses this issue by retrieving and integrating external knowledge at inference time. However, retrieval models often fail to find the most relevant context, and generation models, with limited context capacity, can hallucinate when given irrelevant data. We present a novel framework that leverages a Programming Knowledge Graph (PKG) to semantically represent and retrieve code. This approach enables fine-grained code retrieval by focusing on the most relevant segments while reducing irrelevant context through a tree-pruning technique. The PKG is coupled with a re-ranking mechanism to further reduce hallucinations by selectively integrating non-RAG solutions. We propose two retrieval approaches, block-wise and function-wise, based on the PKG, optimizing context granularity. Evaluations on the HumanEval and MBPP benchmarks show our method improves pass@1 accuracy by up to 20% and outperforms state-of-the-art models by up to 34% on MBPP. Our contributions include PKG-based retrieval, tree pruning to enhance retrieval precision, a re-ranking method for robust solution selection, and a Fill-in-the-Middle (FIM) enhancer module for automatic code augmentation with relevant comments and docstrings.
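The block-wise retrieval idea described above can be illustrated with a toy sketch. This is not the paper's implementation: the graph layout, keyword-overlap scoring, and pruning threshold are all simplified stand-ins for the PKG's semantic retrieval.

```python
import re

# Toy PKG-style block-wise retrieval: code blocks are graph nodes, scored by
# keyword overlap with the query; branches below a low-scoring block are
# pruned entirely (the tree-pruning step). Names and scoring are illustrative.

def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_blocks(pkg, query, threshold=0.2):
    """pkg: {block_id: {"code": str, "children": [block_id, ...]}}.
    Returns block ids ranked by overlap score, skipping pruned subtrees."""
    q = tokenize(query)
    hits = []

    def visit(block_id):
        node = pkg[block_id]
        score = len(q & tokenize(node["code"])) / max(len(q), 1)
        if score < threshold:
            return  # prune this branch and everything under it
        hits.append((score, block_id))
        for child in node["children"]:
            visit(child)

    roots = [b for b in pkg if not any(b in n["children"] for n in pkg.values())]
    for root in roots:
        visit(root)
    return [b for _, b in sorted(hits, reverse=True)]

pkg = {
    "mod": {"code": "module with binary search and sort helpers",
            "children": ["f1", "f2"]},
    "f1": {"code": "def binary_search(arr, target): return position of target in sorted arr",
           "children": []},
    "f2": {"code": "def bubble_sort(arr): repeatedly swap adjacent items",
           "children": []},
}
print(retrieve_blocks(pkg, "binary search in a sorted array"))  # → ['f1', 'mod']
```

The unrelated `bubble_sort` block scores zero and is pruned, so only the relevant segments reach the generation context.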

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#711 - Safavi 2021
Relational World Knowledge Representation in Contextual Language Models: A Review

Safavi, T.; Koutra, D.; Assoc Computat, Linguist

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021;():1053-1067

Punta Cana, DOMINICAN REP Assoc Computational Linguistics-Acl 2021

Ref ID: 3589

Relational knowledge bases (KBs) are commonly used to represent world knowledge in machines. However, while advantageous for their high degree of precision and interpretability, KBs are usually organized according to manually-defined schemas, which limit their expressiveness and require significant human efforts to engineer and maintain. In this review, we take a natural language processing perspective to these limitations, examining how they may be addressed in part by training deep contextual language models (LMs) to internalize and express relational knowledge in more flexible forms. We propose to organize knowledge representation strategies in LMs by the level of KB supervision provided, from no KB supervision at all to entity- and relation-level supervision. Our contributions are threefold: (1) We provide a high-level, extensible taxonomy for knowledge representation in LMs; (2) Within our taxonomy, we highlight notable models, evaluation tasks, and findings, in order to provide an up-to-date review of current knowledge representation capabilities in LMs; and (3) We suggest future research directions that build upon the complementary aspects of LMs and KBs as knowledge representations.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3307 - Saha 2023
A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction

Saha, Anik; Hassanzadeh, Oktie; Gittens, Alex; Ni, Jian; Srinivas, Kavitha; Yener, Bulent

arXiv 2023;():

2023

Ref ID: 7795

Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare them with a span-based approach to causality extraction. Our experiments show that embeddings from pre-trained language models (e.g. BERT) provide a significant performance boost on this task compared to previous state-of-the-art models with complex architectures. We observe that span-based models perform better than simple sequence tagging models based on BERT across all four datasets from diverse domains with different types of cause-effect phrases.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#2169 - Saini 2021
Automated Traceability for Domain Modelling Decisions Empowered by Artificial Intelligence

Saini, R.; Mussbacher, G.; Guo, J. L. C.; Kienzle, J.

2021 IEEE 29th International Requirements Engineering Conference (RE) 2021;():173-184

2021

DOI: 10.1109/RE51729.2021.00023 · Ref ID: 6300

Domain modelling abstracts real-world entities and their relationships in the form of class diagrams for a given domain problem space. Modellers often perform domain modelling to reduce the gap between understanding the problem description which expresses requirements in natural language and the concise interpretation of these requirements. However, the manual practice of domain modelling is both time-consuming and error-prone. These issues are further aggravated when problem descriptions are long, which makes it hard to trace modelling decisions from domain models to problem descriptions or vice versa, leading to completeness and conciseness issues. Automated support for tracing domain modelling decisions in both directions is thus advantageous. In this paper, we propose an automated approach that uses artificial intelligence techniques to extract domain models along with their trace links. We present a traceability information model to enable traceability of modelling decisions in both directions and provide its proof-of-concept in the form of a tool. The evaluation on a set of unseen problem descriptions shows that our approach is promising with an overall median F2 score of 82.04%. We conduct an exploratory user study to assess the benefits and limitations of our approach and present the lessons learned from this study.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1208 - Sakai 2024
Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?

Sakai, Y.; Kamigaito, H.; Hayashi, K.; Watanabe, T.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():8084-8099

Association for Computational Linguistics (ACL) 2024

Ref ID: 4443

Knowledge graphs (KGs) consist of links that describe relationships between entities. Due to the difficulty of manually enumerating all relationships between entities, automatically completing them is essential for KGs. Knowledge Graph Completion (KGC) is a task that infers unseen relationships between entities in a KG. Traditional embedding-based KGC methods (e.g. RESCAL, TransE, DistMult, ComplEx, RotatE, HAKE, and HousE) infer missing links using only the knowledge from training data. In contrast, the recent Pre-trained Language Model (PLM)-based KGC utilizes knowledge obtained during pre-training, which means it can estimate missing links between entities by reusing memorized knowledge from pre-training without inference. This is problematic because the aim of building KGC models is to infer unseen links between entities. However, conventional evaluations in KGC do not consider inference and memorization abilities separately. Thus, a PLM-based KGC method, which achieves high performance in current KGC evaluations, may be ineffective in practical applications. To address this issue, we analyze whether PLM-based KGC methods make inferences or merely access memorized knowledge. For this purpose, we propose a method for constructing synthetic datasets specified in this analysis and conclude that PLMs acquire the inference abilities required for KGC through pre-training, even though the performance improvements mostly come from textual information of entities and relations. © 2024 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1948 - Sakhovskiy 2024
TextGraphs 2024 Shared Task on Text-Graph Representations for Knowledge Graph Question Answering

Sakhovskiy, A.; Salnikov, M.; Nikishina, I.; Usmanova, A.; Kraft, A.; Möller, C.; Banerjee, D.; Huang, J.; Jiang, L.; Abdullah, R.; Yan, X.; Ustalov, D.; Tutubalina, E.; Usbeck, R.; Panchenko, A.

TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():116-125

Association for Computational Linguistics (ACL) 2024

Ref ID: 4264

This paper describes the results of the Knowledge Graph Question Answering (KGQA) shared task that was co-located with the TextGraphs 2024 workshop. In this task, given a textual question and a list of entities with the corresponding KG subgraphs, the participating system should choose the entity that correctly answers the question. Our competition attracted thirty teams, four of which outperformed our strong ChatGPT-based zero-shot baseline. In this paper, we overview the participating systems and analyze their performance according to a large-scale automatic evaluation. To the best of our knowledge, this is the first competition aimed at the KGQA problem using the interaction between large language models (LLMs) and knowledge graphs. © 2024 Association for Computational Linguistics.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3514 - Salinas 2023
"I'm not Racist but...": Discovering Bias in the Internal Knowledge of Large Language Models

Salinas, Abel; Penafiel, Louis; McCormack, Robert; Morstatter, Fred

arXiv 2023;():

2023

Ref ID: 7892

Large language models (LLMs) have garnered significant attention for their remarkable performance in a continuously expanding set of natural language processing tasks. However, these models have been shown to harbor inherent societal biases, or stereotypes, which can adversely affect their performance in their many downstream applications. In this paper, we introduce a novel, purely prompt-based approach to uncover hidden stereotypes within any arbitrary LLM. Our approach dynamically generates a knowledge representation of internal stereotypes, enabling the identification of biases encoded within the LLM's internal knowledge. By illuminating the biases present in LLMs and offering a systematic methodology for their analysis, our work contributes to advancing transparency and promoting fairness in natural language processing systems.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#2866 - Sang 2022
A Scalable Embedding Based Neural Network Method for Discovering Knowledge From Biomedical Literature

Sang, S.; Liu, X.; Chen, X.; Zhao, D.

IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19(3):1294-1301

2022

DOI: 10.1109/TCBB.2020.3003947 · Ref ID: 6033

Nowadays, the amount of biomedical literature is growing at an explosive speed, and much useful knowledge remains undiscovered in it. Classical information retrieval techniques allow access to explicit information in a given collection, but are not able to recognize implicit connections. Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting literature. It can significantly support scientific research by identifying new connections between biomedical entities. However, most existing approaches to LBD are not scalable and may not be sufficient to detect complex associations in non-directly-connected literature. In this article, we present a model which incorporates a biomedical knowledge graph, graph embedding, and deep learning methods for literature-based discovery. First, the relations between biomedical entities are extracted from biomedical abstracts, and a knowledge graph is constructed from these relations. Second, graph embedding technologies are applied to convert the entities and relations in the knowledge graph into a low-dimensional vector space. Third, a bidirectional Long Short-Term Memory (BLSTM) network is trained on the entity associations represented by the pre-trained graph embeddings. Finally, the learned model is used for open and closed literature-based discovery tasks. The experimental results show that our method can not only effectively discover hidden associations between entities, but also reveal the corresponding mechanisms of interaction. This suggests that incorporating knowledge graphs and deep learning methods is an effective way to capture the underlying complex associations between entities hidden in the literature.
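The open-discovery task described above can be illustrated structurally with a graph-only toy (Swanson's classic ABC pattern): entities that never co-occur are proposed as hidden associations when they share intermediates. The paper's actual model replaces this counting step with graph embeddings and a BLSTM; the triples below are illustrative, not real extractions.

```python
from collections import defaultdict

def build_graph(triples):
    """triples: (head, relation, tail) tuples extracted from abstracts."""
    adj = defaultdict(set)
    for h, _, t in triples:
        adj[h].add(t)
        adj[t].add(h)  # treat relations as undirected links for discovery
    return adj

def open_discovery(adj, source):
    """Rank entities not linked to `source` by number of shared intermediates."""
    scores = defaultdict(int)
    for b in adj[source]:          # B: direct neighbours of A
        for c in adj[b]:           # C: neighbours of B
            if c != source and c not in adj[source]:
                scores[c] += 1     # hidden A-C association via B
    return sorted(scores.items(), key=lambda kv: -kv[1])

triples = [
    ("fish_oil", "reduces", "blood_viscosity"),
    ("blood_viscosity", "aggravates", "raynaud_syndrome"),
    ("fish_oil", "inhibits", "platelet_aggregation"),
    ("platelet_aggregation", "worsens", "raynaud_syndrome"),
]
adj = build_graph(triples)
print(open_discovery(adj, "fish_oil"))  # → [('raynaud_syndrome', 2)]
```

Here `fish_oil` and `raynaud_syndrome` never co-occur directly but share two intermediates, so the hidden association is surfaced with evidence count 2.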

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3571 - Sanmartin 2024
KG-RAG: Bridging the Gap Between Knowledge and Creativity

Sanmartin, Diego

arXiv 2024;():

2024

Ref ID: 8298

Ensuring factual accuracy while maintaining the creative capabilities of Large Language Model Agents (LMAs) poses significant challenges in the development of intelligent agent systems. LMAs face prevalent issues such as information hallucinations, catastrophic forgetting, and limitations in processing long contexts when dealing with knowledge-intensive tasks. This paper introduces a KG-RAG (Knowledge Graph-Retrieval Augmented Generation) pipeline, a novel framework designed to enhance the knowledge capabilities of LMAs by integrating structured Knowledge Graphs (KGs) with the functionalities of LLMs, thereby significantly reducing the reliance on the latent knowledge of LLMs. The KG-RAG pipeline constructs a KG from unstructured text and then performs information retrieval over the newly created graph to perform KGQA (Knowledge Graph Question Answering). The retrieval methodology leverages a novel algorithm called Chain of Explorations (CoE) which benefits from LLMs reasoning to explore nodes and relationships within the KG sequentially. Preliminary experiments on the ComplexWebQuestions dataset demonstrate notable improvements in the reduction of hallucinated content and suggest a promising path toward developing intelligent systems adept at handling knowledge-intensive tasks.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3835 - Sannidhi 2024
Retrieval-Augmented Generation Meets Data-Driven Tabula Rasa Approach for Temporal Knowledge Graph Forecasting

Sannidhi, Geethan; Sakhinana, Sagar Srinivas; Runkana, Venkataramana

arXiv 2024;():

2024

Ref ID: 8556

Pre-trained large language models (PLLMs) like OpenAI ChatGPT and Google Gemini face challenges such as inaccurate factual recall, hallucinations, biases, and future data leakage for temporal Knowledge Graph (tKG) forecasting. To address these issues, we introduce sLA-tKGF (small-scale language assistant for tKG forecasting), which utilizes Retrieval-Augmented Generation (RAG) aided, custom-trained small-scale language models through a tabula rasa approach from scratch for effective tKG forecasting. Our framework constructs knowledge-infused prompts with relevant historical data from tKGs, web search results, and PLLM-generated textual descriptions to understand historical entity relationships prior to the target time. It leverages these external knowledge-infused prompts for deeper understanding and reasoning of context-specific semantic and temporal information to zero-shot prompt small-scale language models for more accurate predictions of future events within tKGs. It reduces hallucinations and mitigates distributional shift challenges by comprehending changing trends over time. As a result, it enables more accurate and contextually grounded forecasts of future events while minimizing computational demands. Rigorous empirical studies demonstrate our framework's robustness, scalability, and state-of-the-art (SOTA) performance on benchmark datasets with interpretable and trustworthy tKG forecasting.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3484 - Sansford 2024
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework

Sansford, Hannah; Richardson, Nicholas; Maretic, Hermina Petric; Saada, Juba Nait

arXiv 2024;():

2024

Ref ID: 8460

Methods to evaluate Large Language Model (LLM) responses and detect inconsistencies, also known as hallucinations, with respect to the provided knowledge, are becoming increasingly important for LLM applications. Current metrics fall short in their ability to provide explainable decisions, systematically check all pieces of information in the response, and are often too computationally expensive to be used in practice. We present GraphEval: a hallucination evaluation framework based on representing information in Knowledge Graph (KG) structures. Our method identifies the specific triples in the KG that are prone to hallucinations and hence provides more insight into where in the response a hallucination has occurred, if at all, than previous methods. Furthermore, using our approach in conjunction with state-of-the-art natural language inference (NLI) models leads to an improvement in balanced accuracy on various hallucination benchmarks, compared to using the raw NLI models. Lastly, we explore the use of GraphEval for hallucination correction by leveraging the structure of the KG, a method we name GraphCorrect, and demonstrate that the majority of hallucinations can indeed be rectified.
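The triple-level checking idea behind GraphEval can be sketched in a few lines. This is only a structural illustration: the paper extracts triples from free-text responses with NLI/LLM components, whereas here the triples are hand-written and matching is exact set membership.

```python
# Minimal sketch of KG-based hallucination checking: an LLM response is
# reduced to (subject, relation, object) triples, and each triple is checked
# against the reference knowledge; unsupported triples are flagged as
# candidate hallucinations, localizing *where* the response goes wrong.

def check_triples(response_triples, reference_kg):
    """Partition response triples into (supported, flagged) lists."""
    supported, flagged = [], []
    for triple in response_triples:
        (supported if triple in reference_kg else flagged).append(triple)
    return supported, flagged

reference_kg = {
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Marie Curie", "won", "Nobel Prize in Chemistry"),
}
response = [
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Marie Curie", "won", "Nobel Prize in Literature"),  # hallucinated claim
]
supported, flagged = check_triples(response, reference_kg)
print(flagged)  # → [('Marie Curie', 'won', 'Nobel Prize in Literature')]
```

Because the verdict is per-triple rather than per-response, a correction step (GraphCorrect in the paper) can target only the flagged triples.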

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#3506 - Sarmah 2024
HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction

Sarmah, Bhaskarjit; Hall, Benika; Rao, Rohan; Patel, Sunil; Pasquali, Stefano; Mehta, Dhagash

arXiv 2024;():

2024

Ref ID: 8522

Extraction and interpretation of intricate information from unstructured text data arising in financial applications, such as earnings call transcripts, present substantial challenges to large language models (LLMs), even with current best-practice Retrieval Augmented Generation (RAG) techniques (referred to as VectorRAG, which utilize vector databases for information retrieval), due to challenges such as domain-specific terminology and complex document formats. We introduce a novel approach, called HybridRAG, that combines Knowledge Graph (KG) based RAG techniques (called GraphRAG) with VectorRAG techniques to enhance question-answer (Q&A) systems for information extraction from financial documents, and show that it is capable of generating accurate and contextually relevant answers. Using experiments on a set of financial earnings call transcripts, which come in a Q&A format and hence provide a natural set of ground-truth Q&A pairs, we show that HybridRAG, which retrieves context from both a vector database and a KG, outperforms both traditional VectorRAG and GraphRAG individually when evaluated at both the retrieval and generation stages in terms of retrieval accuracy and answer generation. The proposed technique has applications beyond the financial domain.
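The hybrid retrieval step can be sketched as follows. The corpus, KG triples, and bag-of-words cosine scoring are illustrative stand-ins; the paper's pipeline uses a real vector database and a GraphRAG component.

```python
import math
from collections import Counter

# Toy HybridRAG retrieval: fetch context from a "vector store" (bag-of-words
# cosine similarity here) AND from a knowledge graph (triples mentioning
# entities in the query), then concatenate both into one context.

def cosine(a, b):
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query, passages, kg, top_k=1):
    vector_hits = sorted(passages, key=lambda p: -cosine(query, p))[:top_k]
    graph_hits = [f"{h} {r} {t}" for h, r, t in kg
                  if h.lower() in query.lower() or t.lower() in query.lower()]
    return vector_hits + graph_hits  # combined context for the generator

passages = [
    "Acme Corp reported record revenue in the Q2 earnings call.",
    "The weather in London was mild this quarter.",
]
kg = [("Acme Corp", "has_CEO", "J. Doe"),
      ("Acme Corp", "reported", "record revenue")]
context = hybrid_retrieve("What revenue did Acme Corp report?", passages, kg)
print(context)
```

The generator then sees both the verbatim transcript passage and the structured facts, which is the complementarity the paper evaluates.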

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3933 - Sarto 2024
Towards Retrieval-Augmented Architectures for Image Captioning

Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Nicolosi, Alessandro; Cucchiara, Rita

arXiv 2024;():

2024

Ref ID: 8304

The objective of image captioning models is to bridge the gap between the visual and linguistic modalities by generating natural language descriptions that accurately reflect the content of input images. In recent years, researchers have leveraged deep learning-based models and made advances in the extraction of visual features and the design of multimodal connections to tackle this task. This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process. Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities, a differentiable encoder to represent input images, and a kNN-augmented language model to predict tokens based on contextual cues and text retrieved from the external memory. We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions, especially with a larger retrieval corpus. This work provides valuable insights into retrieval-augmented captioning models and opens up new avenues for improving image captioning at a larger scale.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1575 - Sawant 2013
Learning joint query interpretation and response ranking

Sawant, U.; Chakrabarti, S.

WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web 2013;():1099-1109

Association for Computing Machinery 2013

DOI: 10.1145/2488388.2488484 · Ref ID: 5839

Thanks to information extraction and semantic Web efforts, search on unstructured text is increasingly refined using semantic annotations and structured knowledge bases. However, most users cannot become familiar with the schema of knowledge bases and ask structured queries. Interpreting free-format queries into a more structured representation is of much current interest. The dominant paradigm is to segment or partition query tokens by purpose (references to types, entities, attribute names, attribute values, relations) and then launch the interpreted query on structured knowledge bases. Given that structured knowledge extraction is never complete, here we choose a less trodden path: a data representation that retains the unstructured text corpus, along with structured annotations (mentions of entities and relationships) on it. We propose two new, natural formulations for joint query interpretation and response ranking that exploit bidirectional flow of information between the knowledge base and the corpus. One, inspired by probabilistic language models, computes expected response scores over the uncertainties of query interpretation. The other is based on max-margin discriminative learning, with latent variables representing those uncertainties. In the context of typed entity search, both formulations bridge a considerable part of the accuracy gap between a generic query that does not constrain the type at all, and the upper bound where the "perfect" target entity type of each query is provided by humans. Our formulations are also superior to a two-stage approach of first choosing a target type using recent query type prediction techniques, and then launching a type-restricted entity search query. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1182 - Sawczyn 2024
Developing PUGG for Polish: A Modern Approach to KBQA, MRC, and IR Dataset Construction

Sawczyn, A.; Viarenich, K.; Wojtasik, K.; Domogala, A.; Oleksy, M.; Piasecki, M.; Kajdanowicz, T.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():10978-10996

Association for Computational Linguistics (ACL) 2024

Ref ID: 4280

Advancements in AI and natural language processing have revolutionized machine-human language interactions, with question answering (QA) systems playing a pivotal role. The knowledge base question answering (KBQA) task, utilizing structured knowledge graphs (KG), allows for handling extensive knowledge-intensive questions. However, a significant gap exists in KBQA datasets, especially for low-resource languages. Many existing construction pipelines for these datasets are outdated and inefficient in human labor, and modern assisting tools like Large Language Models (LLM) are not utilized to reduce the workload. To address this, we have designed and implemented a modern, semi-automated approach for creating datasets, encompassing tasks such as KBQA, Machine Reading Comprehension (MRC), and Information Retrieval (IR), tailored explicitly for low-resource environments. We executed this pipeline and introduced the PUGG dataset, the first Polish KBQA dataset, and novel datasets for MRC and IR. Additionally, we provide a comprehensive implementation, insightful findings, detailed statistics, and evaluation of baseline models. © 2024 Association for Computational Linguistics.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3795 - Scheerer 2024
QirK: Question Answering via Intermediate Representation on Knowledge Graphs

Scheerer, Jan Luca; Lykov, Anton; Kayali, Moe; Fountalis, Ilias; Olteanu, Dan; Vasiloglou, Nikolaos; Suciu, Dan

arXiv 2024;():

2024

Ref ID: 8531

We demonstrate QirK, a system for answering natural language questions on Knowledge Graphs (KG). QirK can answer structurally complex questions that are still beyond the reach of emerging Large Language Models (LLMs). It does so using a unique combination of database technology, LLMs, and semantic search over vector embeddings. The glue for these components is an intermediate representation (IR). The input question is mapped to IR using LLMs, which is then repaired into a valid relational database query with the aid of a semantic search on vector embeddings. This allows a practical synthesis of LLM capabilities and KG reliability. A short video demonstrating QirK is available at https://youtu.be/6c81BLmOZ0U.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#3235 - Schneider 2024
Bridging Information Gaps in Dialogues With Grounded Exchanges Using Knowledge Graphs

Schneider, Phillip; Machner, Nektarios; Jokinen, Kristiina; Matthes, Florian

arXiv 2024;():

2024

Ref ID: 8501

Knowledge models are fundamental to dialogue systems for enabling conversational interactions, which require handling domain-specific knowledge. Ensuring effective communication in information-providing conversations entails aligning user understanding with the knowledge available to the system. However, dialogue systems often face challenges arising from semantic inconsistencies in how information is expressed in natural language compared to how it is represented within the system's internal knowledge. To address this problem, we study the potential of large language models for conversational grounding, a mechanism to bridge information gaps by establishing shared knowledge between dialogue participants. Our approach involves annotating human conversations across five knowledge domains to create a new dialogue corpus called BridgeKG. Through a series of experiments on this dataset, we empirically evaluate the capabilities of large language models in classifying grounding acts and identifying grounded information items within a knowledge graph structure. Our findings offer insights into how these models use in-context learning for conversational grounding tasks and common prediction errors, which we illustrate with examples from challenging dialogues. We discuss how the models handle knowledge graphs as a semantic layer between unstructured dialogue utterances and structured information items.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2733 - Schoch 2024
NL2IBE – Ontology-controlled Transformation of Natural Language into Formalized Engineering Artefacts

Schoch, N.; Hoernicke, M.

2024 IEEE Conference on Artificial Intelligence (CAI) 2024;():997-1004

2024

DOI: 10.1109/CAI59869.2024.00182 · Ref ID: 6542

Looking at Process and Automation Engineering (P&AE) today, for the technically adept engineer, there are many different tools available to support the engineering work, from translation of engineering intentions into module and plant descriptions, to definition and parametrization of entire process plant setups, to export to a control system. However, still today, in the very early engineering phases, engineering intentions either need to be entered already in a structured and controlled expert language or require a human expert's manual efforts for translation from unstructured language into formalized representations, in order to enable consistent further processing in the existing tools. This process is time-consuming, fuzzy, and error-prone due to potential misconceptions and ambiguities, even for domain experts. In this work, we therefore present our NL2IBE Tool, which makes use of modern Natural Language Processing in combination with Ontology Mining, and which, based on and controlled by an underlying ontology, allows for the deterministic transformation of natural language intentions into structured and consistent engineering artefacts. We describe the overall tool architecture as well as crucial functionalities and implementation features, followed by an evaluation using the example of a hydrogen generation and CCSU use case. We conclude with a discussion of the proposed tool and give an outlook on future research.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#2824 - Schorlemmer 2011
Reasoning about Distributed Knowledge-Transforming Peer Interactions

Schorlemmer, M.; Robertson, D.

IEEE Transactions on Knowledge and Data Engineering 2011;23(9):1419-1431

2011

DOI: 10.1109/TKDE.2010.265 · Ref ID: 6028

We address the problem of how to reason about properties of knowledge transformations as they occur in distributed and decentralized interactions between large and complex artifacts, such as databases, web services, and ontologies. Based on the conceptual distinction between specifications of interactions and properties of knowledge transformations that follow from these interactions, we explore a novel mixture of process calculus and property inference by connecting interaction models with knowledge transformation rules. We aim at being generic in our exploration, hence our emphasis on abstract knowledge transformations, although we exemplify it using a lightweight specification language for interaction modeling (for which an executable peer-to-peer environment already exists) and provide a formal semantics for knowledge transformation rules using the theory of institutions. Consequently, our exploration is also an example of the gain obtained by linking current state-of-the-art distributed knowledge engineering based on web services and peer-based architectures with formal methods drawn from a long tradition in algebraic specification.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3924 - Sengupta 2024
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings

Sengupta, Saptarshi; Heaton, Connor; Cui, Suhan; Sarkar, Soumalya; Mitra, Prasenjit

arXiv 2024;():

2024

Ref ID: 8035

In Natural Language Processing (NLP), Machine Reading Comprehension (MRC) is the task of answering a question based on a given context. To handle questions in the medical domain, modern language models such as BioBERT, SciBERT and even ChatGPT are trained on vast amounts of in-domain medical corpora. However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. Knowledge graphs are powerful resources for accessing medical information. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from medical knowledge graphs with the embedding spaces of pre-trained language models (LMs). The aligned embeddings are fused with open-domain LMs BERT and RoBERTa that are fine-tuned for two MRC tasks, span detection (COVID-QA) and multiple-choice questions (PubMedQA). We compare our method to prior techniques that rely on a vocabulary overlap for embedding alignment and show how our method circumvents this requirement to deliver better performance. On both datasets, our method allows BERT/RoBERTa to either perform on par (occasionally exceeding) with stronger domain-specific models or show improvements in general over prior techniques. With the proposed approach, we signal an alternative method to in-domain pre-training to achieve domain proficiency.

Mike voted
Srividya voted

#661 - Shang 2019
Pre-training of Graph Augmented Transformers for Medication Recommendation

Shang, J. Y.; Ma, T. F.; Xiao, C.; Sun, J. M.

28th International Joint Conference on Artificial Intelligence 2019;():5953-5959

Macao, PEOPLES R CHINA Ijcai-Int Joint Conf Artif Intell 2019

Ref ID: 3686

Medication recommendation is an important healthcare application. It is commonly formulated as a temporal prediction task. Hence, most existing works only utilize longitudinal electronic health records (EHRs) from a small number of patients with multiple visits, ignoring a large number of patients with a single visit (selection bias). Moreover, important hierarchical knowledge such as diagnosis hierarchy is not leveraged in the representation learning process. To address these challenges, we propose G-BERT, a new model to combine the power of Graph Neural Networks (GNNs) and BERT (Bidirectional Encoder Representations from Transformers) for medical code representation and medication recommendation. We use GNNs to represent the internal hierarchical structures of medical codes. Then we integrate the GNN representation into a transformer-based visit encoder and pre-train it on EHR data from patients with only a single visit. The pre-trained visit encoder and representation are then fine-tuned for downstream predictive tasks on longitudinal EHRs from patients with multiple visits. G-BERT is the first to bring the language model pre-training schema into the healthcare domain, and it achieved state-of-the-art performance on the medication recommendation task.

Kwesi voted
Xinchen voted

#1880 - Shang 2022
Sequential Semantic Knowledge Graph Embedding

Shang, Y. M.; Huang, H.; Yuan, Y.

Lecture Notes in Electrical Engineering 2022;861 LNEE():1547-1557

Springer Science and Business Media Deutschland GmbH 2022

DOI: 10.1007/978-981-16-9492-9_153 · Ref ID: 5532

Knowledge graph embedding aims to represent the entities and relations of a knowledge graph in a low-dimensional continuous vector space. Previous embedding models pay little attention to the sequential semantic information in triples and, as a result, may suffer from the semantic drift problem. To this end, this paper proposes a novel sequential semantic embedding (SeqSemE) model to address this problem. First, we utilize a sequential language model to capture the sequential information of triples and the interactions between entities and relations. Second, we propose a method of learning two embeddings for each relation to avoid semantic drift. Extensive experiments on link prediction show that SeqSemE is efficient and effective, obtaining better performance than previous state-of-the-art embedding models. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Mike voted
mohammed afaan voted

#214 - Shao 2024
Enhancing Bug Report Summaries Through Knowledge-Specific and Contrastive Learning Pre-Training

Shao, Y. N.; Xiang, B. M.

IEEE Access 2024;12():37653-37662

2024

DOI: 10.1109/access.2024.3368915 · Ref ID: 3767

Bug reports are crucial in software maintenance, with concise summaries significantly enhancing the efficiency of bug triagers and ultimately contributing to the development of high-quality software products. Contemporary methods for automatic bug report summarization primarily utilize neural networks' robust learning capabilities. However, these approaches often produce suboptimal summaries due to two primary limitations: 1) the difficulty of assimilating the domain-specific knowledge inherent in bug reports, and 2) the limitations of purely supervised learning in comprehending the comprehensive context of bug reports. To address these two problems, this paper proposes a new approach for bug report summarization, namely KSCLP, which leverages large language models and domain-specific pre-training strategies, i.e., Knowledge-Specific and Contrastive Learning Pre-training. Specifically, the Knowledge-Specific strategy pre-trains KSCLP on a project-specific bug report corpus, through which the model fully learns the internal knowledge of bug reports and acquires bug-report-aware representations. The Contrastive Learning strategy performs sequence-level pre-training for KSCLP, helping it capture the semantic information of bug reports at a global level. Upon completion of the pre-training phase, KSCLP undergoes further refinement through a sequence-to-sequence framework specifically tailored for bug report summarization. The efficacy of KSCLP is rigorously evaluated against five baseline models using a publicly available dataset. The empirical results demonstrate that KSCLP outperforms all baselines, achieving improvements of up to 23.73, 13.97, and 20.89 points in the ROUGE-1, ROUGE-2, and ROUGE-L metrics, thereby setting new benchmarks in the field of bug report summarization.

brandon voted
Kwesi voted

#1700 - Shao 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL

Shao, Y.; Nakashole, N.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():131-156

Association for Computational Linguistics (ACL) 2024

Ref ID: 4440

Structured data, prevalent in tables, databases, and knowledge graphs, poses a significant challenge in its representation. With the advent of large language models (LLMs), there has been a shift towards linearization-based methods, which process structured data as sequential token streams, diverging from approaches that explicitly model structure, often as a graph. Crucially, there remains a gap in our understanding of how these linearization-based methods handle structured data, which is inherently non-linear. This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5. Our findings reveal the model’s ability to mimic human-designed processes such as schema linking and syntax prediction, indicating a deep, meaningful learning of structure beyond simple token sequencing. We also uncover insights into the model’s internal mechanisms, including the ego-centric nature of structure node encodings and the potential for model compression due to modality fusion redundancy. Overall, this work sheds light on the inner workings of linearization-based methods and could potentially provide guidance for future research. © 2024 Association for Computational Linguistics.

Srividya voted
Mike voted

#3512 - Shapurian 2023
Identifying Planetary Names in Astronomy Papers: A Multi-Step Approach

Shapurian, Golnaz; Kurtz, Michael J.; Accomazzi, Alberto

arXiv 2023;():

2023

Ref ID: 7985

The automatic identification of planetary feature names in astronomy publications presents numerous challenges. These features include craters, defined as roughly circular depressions resulting from impact or volcanic activity; dorsas, which are elongate raised structures or wrinkle ridges; and lacus, small irregular patches of dark, smooth material on the Moon, referred to as "lake" (Planetary Names Working Group, n.d.). Many feature names overlap with places or people's names that they are named after, for example, Syria, Tempe, Einstein, and Sagan, to name a few (U.S. Geological Survey, n.d.). Some feature names have been used in many contexts, for instance, Apollo, which can refer to mission, program, sample, astronaut, seismic, seismometers, core, era, data, collection, instrument, and station, in addition to the crater on the Moon. Some feature names can appear in the text as adjectives, like the lunar craters Black, Green, and White. Some feature names in other contexts serve as directions, like craters West and South on the Moon. Additionally, some features share identical names across different celestial bodies, requiring disambiguation, such as the Adams crater, which exists on both the Moon and Mars. We present a multi-step pipeline combining rule-based filtering, statistical relevance analysis, part-of-speech (POS) tagging, named entity recognition (NER) model, hybrid keyword harvesting, knowledge graph (KG) matching, and inference with a locally installed large language model (LLM) to reliably identify planetary names despite these challenges. When evaluated on a dataset of astronomy papers from the Astrophysics Data System (ADS), this methodology achieves an F1-score over 0.97 in disambiguating planetary feature names.

mohammed afaan voted
yuexi voted

#359 - Sharifzadeh 2022
Improving Scene Graph Classification by Exploiting Knowledge from Texts

Sharifzadeh, S.; Baharlou, S. M.; Schmitt, M.; Schütze, H.; Tresp, V.; Assoc Advancement Artificial, Intelligence

36th AAAI Conference on Artificial Intelligence / 34th Conference on Innovative Applications of Artificial Intelligence / 12th Symposium on Educational Advances in Artificial Intelligence 2022;():2189-2197

Electr Network Assoc Advancement Artificial Intelligence 2022

Ref ID: 3339

Training scene graph classification models requires a large amount of annotated image data. Meanwhile, scene graphs represent relational knowledge that can be modeled with symbolic data from texts or knowledge graphs. While image annotation demands extensive labor, collecting textual descriptions of natural scenes requires less effort. In this work, we investigate whether textual scene descriptions can substitute for annotated image data. To this end, we employ a scene graph classification framework that is trained not only from annotated images but also from symbolic data. In our architecture, the symbolic entities are first mapped to their correspondent image-grounded representations and then fed into the relational reasoning pipeline. Even though a structured form of knowledge, such as the form in knowledge graphs, is not always available, we can generate it from unstructured texts using a transformer-based language model. We show that by fine-tuning the classification pipeline with the extracted knowledge from texts, we can achieve ~8x more accurate results in scene graph classification, ~3x in object classification, and ~1.5x in predicate classification, compared to the supervised baselines with only 1% of the annotated images.

Srividya voted
Xinchen voted

#806 - Sharma 2021
T³: Domain-Agnostic Neural Time-series Narration

Sharma, M.; Brownstein, J. S.; Ramakrishnan, N.

21st IEEE International Conference on Data Mining (IEEE ICDM) 2021;():1324-1329

Electr Network Ieee Computer Soc 2021

DOI: 10.1109/icdm51629.2021.00165 · Ref ID: 3399

The task of generating rich and fluent narratives that aptly describe the characteristics, trends, and anomalies of time-series data is invaluable to the sciences (geology, meteorology, epidemiology) or finance (trades, stocks). The efforts for time-series narration hitherto are domain-specific and use predefined templates that offer consistency but lead to mechanical narratives. We present T³ (Time-series-To-Text), a domain-agnostic neural framework for time-series narration, that couples the representation of essential time-series elements in the form of a dense knowledge graph and the translation of said knowledge graph into rich and fluent narratives through the transfer-learning capabilities of PLMs (Pre-trained Language Models). To the best of our knowledge, T³ is the first investigation of the use of neural strategies for time-series narration. We showcase that T³ can improve the lexical diversity of the generated narratives by up to 65.38% while still maintaining grammatical integrity. The performance and practicality of T³ are further validated through an expert review (n = 21) where 76.2% of participating experts wary of auto-generated narratives favored T³ as a deployable system for time-series narration due to its rich and diverse narratives. Our code base and the datasets used, with detailed instructions for reproducibility, are publicly hosted.

brandon voted
Kwesi voted

#3905 - Sharma 2021
TCube: Domain-Agnostic Neural Time-series Narration

Sharma, Mandar; Brownstein, John S.; Ramakrishnan, Naren

arXiv 2021;():

2021

Ref ID: 7488

The task of generating rich and fluent narratives that aptly describe the characteristics, trends, and anomalies of time-series data is invaluable to the sciences (geology, meteorology, epidemiology) or finance (trades, stocks, or sales and inventory). The efforts for time-series narration hitherto are domain-specific and use predefined templates that offer consistency but lead to mechanical narratives. We present TCube (Time-series-to-text), a domain-agnostic neural framework for time-series narration, that couples the representation of essential time-series elements in the form of a dense knowledge graph and the translation of said knowledge graph into rich and fluent narratives through the transfer-learning capabilities of PLMs (Pre-trained Language Models). TCube's design primarily addresses the challenge that lies in building a neural framework in the complete paucity of annotated training data for time-series. The design incorporates knowledge graphs as an intermediary for the representation of essential time-series elements which can be linearized for textual translation. To the best of our knowledge, TCube is the first investigation of the use of neural strategies for time-series narration. Through extensive evaluations, we show that TCube can improve the lexical diversity of the generated narratives by up to 65.38% while still maintaining grammatical integrity. The practicality and deployability of TCube are further validated through an expert review (n=21) where 76.2% of participating experts wary of auto-generated narratives favored TCube as a deployable system for time-series narration due to its richer narratives. Our code base, models, and datasets, with detailed instructions for reproducibility, are publicly hosted at https://github.com/Mandar-Sharma/TCube.

Davis voted
mohammed afaan voted

#1303 - Shcherbakov 2020
Exploring Looping Effects in RNN-based Architectures

Shcherbakov, A.; Muradoğlu, S.; Vylomova, E.

Proceedings of the Australasian Language Technology Workshop 2020;18():6

Australasian Language Technology Association 2020

Ref ID: 5633

The paper investigates repetitive loops, a common problem in contemporary text generation systems (such as machine translation, language modelling, and morphological inflection). We hypothesized that a model's failure to distinguish the respective latent states for different positions in an output sequence may be the primary cause of the looping. Therefore, we propose adding a position-aware discriminating factor to the model in an attempt to reduce that effect. We conduct a study on neural models with recurrent units by explicitly altering their decoder internal state. We use the task of morphological reinflection as a proxy to study the effects of the changes. Our results show that the probability of the occurrence of repetitive loops is significantly reduced by the introduction of an extra neural decoder output. The output should be specifically trained to produce a gradually increasing value upon generation of each character of a given sequence. We also explored variations of the technique and found that feeding the extra output back to the decoder amplifies the positive effects. © 2020, Australasian Language Technology Association. All rights reserved.

Srividya voted
Mike voted

#3556 - Shen 2024
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

Shen, Guobin; Zhao, Dongcheng; Dong, Yiting; He, Xiang; Zeng, Yi

arXiv 2024;():

2024

Ref ID: 8652

As large language models (LLMs) become integral to various applications, ensuring both their safety and utility is paramount. Jailbreak attacks, which manipulate LLMs into generating harmful content, pose significant challenges to this balance. Existing defenses, such as prompt engineering and safety fine-tuning, often introduce computational overhead, increase inference latency, and lack runtime flexibility. Moreover, overly restrictive safety measures can degrade model utility by causing refusals of benign queries. In this paper, we introduce Jailbreak Antidote, a method that enables real-time adjustment of LLM safety preferences by manipulating a sparse subset of the model's internal states during inference. By shifting the model's hidden representations along a safety direction with varying strengths, we achieve flexible control over the safety-utility balance without additional token overhead or inference delays. Our analysis reveals that safety-related information in LLMs is sparsely distributed; adjusting approximately 5% of the internal state is as effective as modifying the entire state. Extensive experiments on nine LLMs (ranging from 2 billion to 72 billion parameters), evaluated against ten jailbreak attack methods and compared with six defense strategies, validate the effectiveness and efficiency of our approach. By directly manipulating internal states during reasoning, Jailbreak Antidote offers a lightweight, scalable solution that enhances LLM safety while preserving utility, opening new possibilities for real-time safety mechanisms in widely-deployed AI systems.

yuexi voted
Srividya voted

#252 - Shen 2020
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning

Shen, T.; Mao, Y.; He, P. C.; Long, G. D.; Trischler, A.; Chen, W. Z.; Assoc Computat, Linguist

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():8980-8994

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 2978

In this work, we aim at equipping pre-trained language models with structured knowledge. We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs. Building upon entity-level masked language models, our first contribution is an entity masking scheme that exploits relational knowledge underlying the text. This is fulfilled by using a linked knowledge graph to select informative entities and then masking their mentions. In addition, we use knowledge graphs to obtain distractors for the masked entities, and propose a novel distractor-suppressed ranking objective that is optimized jointly with masked language model. In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training, to inject language models with structured knowledge via learning from raw text. It is more efficient than retrieval-based methods that perform entity linking and integration during finetuning and inference, and generalizes more effectively than the methods that directly learn from concatenated graph triples. Experiments show that our proposed model achieves improved performance on five benchmarks, including question answering and knowledge base completion.

Srividya voted
Davis voted

#2277 - Shen 2024
Construction of Knowledge Graph of Judicial Case Based on LLMs and Embedding Models

Shen, Y.

2024 IEEE 2nd International Conference on Sensors, Electronics and Computer Engineering (ICSECE) 2024;():949-955

2024

DOI: 10.1109/ICSECE61636.2024.10729603 · Ref ID: 7075

This paper constructs the Judicial Case Knowledge Graph (JCKG) dataset based on China Judicial Judgements Online, filling a gap in knowledge graph research in the legal field. First, we systematically collate the relevant data from Judicial Judgements Online, and then use advanced large language model (LLM) technology to extract the entities and relationships in the data. Then, through link prediction experiments, we compare JCKG horizontally against mainstream knowledge graph learning datasets and vertically across embedding models such as TransE, ConvE, and pRotatE to verify the validity of the JCKG dataset. Experiments show that the JCKG dataset can effectively support multi-hop inference. A legal knowledge graph plays an important role in intelligent judicial systems, improving the efficiency of case processing and the accuracy of judicial decision-making while reducing the risk of errors. This research provides strong support for the field of legal artificial intelligence and promotes the development of the judicial system in a more efficient and intelligent direction.

Ishan voted
brandon voted

#1003 - Sheng 2023
An Augmentable Domain-specific Models for Financial Analysis

Sheng, J.

Proceedings - 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2023 2023;():

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/CISP-BMEI60920.2023.10373245 · Ref ID: 5012

Large-scale language models such as GPT-4 have revolutionized data analysis and interpretation by generating human-like text, automating insights, and detecting data errors. Large-scale language models have been applied in various fields and play an important role in many areas. Large language models can also perform financial and technical analysis by cleaning data, generating synthetic data, handling bias, and supporting natural language queries. This paper proposes a language model that integrates multimodal data with external knowledge bases and domain-specific data, enhancing its reasoning ability by extending domain-specific data. It reduces hallucinations and fine-tunes on domain-specific data by incorporating external knowledge bases to deepen the model's understanding of industry-specific language, concepts, and context. Technologies such as knowledge graphs, attention mechanisms, cross-modal embedding, and federated collaborative training are used to address the challenges posed by the differing structures and semantics of multi-modal data. The model also employs a feedback loop mechanism that allows it to adapt to changing conditions, such as changing languages or new domain information. Experimental results show that the proposed model has a preliminary domain-specific ability to analyze and predict multimodal financial and technical data. © 2023 IEEE.

Davis voted
Mike voted

#3669 - Shi 2024
LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning

Shi, Guangsi; Deng, Xiaofeng; Luo, Linhao; Xia, Lijuan; Bao, Lei; Ye, Bei; Du, Fei; Pan, Shirui; Li, Yuxiao

arXiv 2024;():

2024

Ref ID: 8413

Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs (KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which makes it hard to provide reliable explanations for recommendation results. An explainable recommender system is crucial for product development and subsequent decision-making. To address these challenges, we introduce a novel recommender that synergizes Large Language Models (LLMs) and KGs to enhance recommendation and provide interpretable results. Specifically, we first harness the power of LLMs to augment KG reconstruction. LLMs comprehend and decompose user reviews into new triples that are added to the KG. In this way, we can enrich KGs with explainable paths that express user preferences. To enhance recommendation on the augmented KGs, we introduce a novel subgraph reasoning module that effectively measures the importance of nodes and discovers reasoning paths for recommendation. Finally, these reasoning paths are fed into the LLMs to generate interpretable explanations of the recommendation results. Our approach significantly enhances both the effectiveness and interpretability of recommender systems, especially in cross-selling scenarios where traditional methods falter. The effectiveness of our approach has been rigorously tested on four open real-world datasets, with our methods demonstrating superior performance over contemporary state-of-the-art techniques, with an average improvement of 12%. The application of our model in the cross-selling recommendation system of a multinational engineering and technology company further underscores its practical utility and potential to redefine recommendation practices through improved accuracy and user trust.

mohammed afaan voted
yuexi voted

#515 - Shi 2024
Legal-LM: Knowledge Graph Enhanced Large Language Models for Law Consulting

Shi, J. M.; Guo, Q. L.; Liao, Y.; Wang, Y. X.; Chen, S. J.; Liang, S. L.

20th International Conference on Intelligent Computing (ICIC) 2024;14878():175-186

Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024

DOI: 10.1007/978-981-97-5672-8_15 · Ref ID: 2956

This paper introduces Legal-LM, an advanced Large Language Model (LLM) enhanced with a Knowledge Graph, specifically designed for legal consulting in the Chinese legal domain. Addressing the challenges of domain-specific adaptation, data veracity, and consultations with non-professional users in legal AI, Legal-LM incorporates extensive legal corpora and a knowledge graph for effective legal knowledge acquisition. The model utilizes techniques such as external legal knowledge bases, soft prompts, and Direct Preference Optimization (DPO) to ensure accurate and diverse legal advice. Our experimental results demonstrate that Legal-LM exhibits superior performance over existing models in legal question answering, case analysis, and legal recommendations, demonstrating its potential to facilitate legal consulting and education.

mohammed afaan voted
yuexi voted

#3727 - Shi 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models

Shi, Weijia; Lee, Jaechan; Huang, Yangsibo; Malladi, Sadhika; Zhao, Jieyu; Holtzman, Ari; Liu, Daogao; Zettlemoyer, Luke; Smith, Noah A.; Zhang, Chiyuan

arXiv 2024;():

2024

Ref ID: 8451

Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content. Data owners may request the removal of their data from a trained model due to privacy or copyright concerns. However, exactly unlearning only these datapoints (i.e., retraining with the data removed) is intractable in modern-day models. This has led to the development of many approximate unlearning algorithms. The evaluation of the efficacy of these algorithms has traditionally been narrow in scope, failing to precisely quantify the success and practicality of the algorithm from the perspectives of both the model deployers and the data owners. We address this issue by proposing MUSE, a comprehensive machine unlearning evaluation benchmark that enumerates six diverse desirable properties for unlearned models: (1) no verbatim memorization, (2) no knowledge memorization, (3) no privacy leakage, (4) utility preservation on data not intended for removal, (5) scalability with respect to the size of removal requests, and (6) sustainability over sequential unlearning requests. Using these criteria, we benchmark how effectively eight popular unlearning algorithms on 7B-parameter LMs can unlearn Harry Potter books and news articles. Our results demonstrate that most algorithms can prevent verbatim memorization and knowledge memorization to varying degrees, but only one algorithm does not lead to severe privacy leakage. Furthermore, existing algorithms fail to meet deployer's expectations because they often degrade general model utility and also cannot sustainably accommodate successive unlearning requests or large-scale content removal. Our findings identify key issues with the practicality of existing unlearning algorithms on language models, and we release our benchmark to facilitate further evaluations: muse-bench.github.io

Srividya voted
Ishan voted

#97 - Shi 2023
ChatGraph: Interpretable Text Classification by Converting ChatGPT Knowledge to Graphs

Shi, Y. C.; Ma, H. H.; Zhong, W. L.; Tan, Q. Y.; Mai, G. C.; Li, X.; Liu, T. M.; Huang, J. Z.

23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():515-520

Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023

DOI: 10.1109/icdmw60847.2023.00073 · Ref ID: 3456

ChatGPT, as a recently launched large language model (LLM), has shown superior performance in various natural language processing (NLP) tasks. However, two major limitations hinder its potential applications: 1) the inflexibility of finetuning on downstream tasks, and 2) the lack of interpretability in the decision-making process. To tackle these limitations, we propose a novel framework that leverages the power of ChatGPT for specific tasks, such as text classification, while improving its interpretability. The proposed framework conducts a knowledge graph extraction task to extract refined and structural knowledge from the raw data using ChatGPT. The rich knowledge is then converted into a graph, which is further used to train an interpretable linear classifier to make predictions. To evaluate the effectiveness of our proposed method, we conduct experiments on four benchmark datasets. The results demonstrate that our method can significantly improve the prediction performance compared to directly utilizing ChatGPT for text classification tasks. Furthermore, our method provides a more transparent decision-making process compared with previous text classification methods. The code is available at https://github.com/sycny/ChatGraph.

Kwesi voted
Xinchen voted

#1450 - Shim 2021
A JOINT FRAMEWORK FOR DISTILLING THE EXPERTISE IN ELECTRIC POWER UTILITY DOMAIN WITH GENBERT

Shim, Y.; Lim, H.; Ha, Y.; Kim, S.; Lee, I.; Jeong, S.

IET Conference Proceedings 2021;2021():1331-1335

Institution of Engineering and Technology 2021

DOI: 10.1049/icp.2021.2165 · Ref ID: 5516

Over the past decades, extracting crucial information such as domain expertise from unstructured data has been considered a great challenge. The domain of electric power utilities faces similar difficulties. Since the individual experiences and know-how of electric power utility technicians are not digitized into a database but fragmented across many reports and documents, it is hard to find the right information, and the knowledge gap between workers has widened more and more. Natural language processing (NLP) based on deep learning technologies is emerging as one of the most efficient ways of searching textual information and extracting valuable context. Using these techniques, we can build a domain-specific language model that provides appropriate answers to users' questions. In this study, we propose a joint framework for distilling the expertise in the electric power utility domain with GenBERT. This framework consists of three sub-components: 'Pre-processing', 'Extract', and 'QA'. To evaluate the performance of our proposed framework, we conducted various comparison experiments on the 'Extract' and 'QA' components. As a result, our framework shows improved QA performance, answering electric power utility domain-specific questions with higher accuracy. © 2021 The Institution of Engineering and Technology.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2431 - Sibunruang 2018
Finding Clinical Knowledge from MEDLINE Abstracts by Text Summarization Technique

Sibunruang, C.; Polpinij, J.

2018 International Conference on Information Technology (InCIT) 2018;():1-6

2018

DOI: 10.23919/INCIT.2018.8584867 · Ref ID: 6208

Today, MEDLINE is an important repository containing more than 26 million citations and abstracts in the fields of medicine, and PubMed provides free access to MEDLINE along with links to full-text articles. MEDLINE abstracts have thus become a potential source of new knowledge in the medical field. However, finding knowledge in MEDLINE abstracts is time-consuming and labour-intensive when a search returns many abstracts, each of which may contain a large volume of information. Therefore, this work presents a method for summarizing clinical knowledge from a MEDLINE abstract. The main mechanisms of the proposed method are driven by natural language processing (NLP) and text filtering techniques. The case study of this work is summarizing clinical knowledge from MEDLINE abstracts relating to cervical cancer in clinical trials. In the evaluation stage, the predicted results are compared against reference results obtained from a domain expert. Measured by recall, precision, and F-score, the method returns satisfactory results, with averages of 0.84, 1.00, and 0.91, respectively.
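The reported averages are internally consistent: the F-measure is the harmonic mean of precision and recall, which can be cross-checked with a few lines of Python (an illustrative check, not the authors' code):

```python
def f_measure(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Averages reported in the abstract: precision 1.00, recall 0.84.
f1 = f_measure(1.00, 0.84)
print(round(f1, 2))  # → 0.91, matching the reported F-measure
```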

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1293 - Simon 2023
Experiments on GPT-3 Assisted Process Model Development

Simon, C.; Haag, S.; Zakfeld, L.

Proceedings - European Council for Modelling and Simulation, ECMS 2023;2023-June():270-276

European Council for Modelling and Simulation 2023

DOI: 10.7148/2023-0270 · Ref ID: 5280

Computer assisted process model development from textual descriptions is still an open research question. Advantages of such a technology lie in shorter development times and possibly a more concise interpretation of the narrative input. A solution to this problem necessarily relies on methods from formal modeling and linguistics. In the latter field, the new GPT-3 model is recognized as a breakthrough that outperforms previous technologies whose limitations hindered success of earlier research in this context. But are GPT-3’s capabilities to summarize text, detect cause-and-effect, or to classify terms sufficient to succeed? The presented research describes the results of systematic experiments to use GPT-3 to interpret a textual process description and transform it into a formal representation. The different settings demonstrate how to exploit the capabilities of large language models and how to avoid pitfalls. Although the observations made are promising, further work is needed. The outcome of this paper identifies the direction in which this future research should proceed. © ECMS Enrico Vicario, Romeo Bandinelli, Virginia Fani, Michele Mastroianni (Editors) 2023.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#790 - Sinclair 2022
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations

Sinclair, A.; Jumelet, J.; Zuidema, W.; Fernández, R.

Trans. Assoc. Comput. Linguist. 2022;10():1031-1050

2022

DOI: 10.1162/tacl_a_00504 · Ref ID: 3317

We investigate the extent to which modern neural language models are susceptible to structural priming, the phenomenon whereby the structure of a sentence makes the same structure more probable in a follow-up sentence. We explore how priming can be used to study the potential of these models to learn abstract structural information, which is a prerequisite for good performance on tasks that require natural language understanding skills. We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors that interact with priming strength. We find that Transformer models indeed show evidence of structural priming, but also that the generalizations they learned are to some extent modulated by semantic information. Our experiments also show that the representations acquired by the models may not only encode abstract sequential structure but involve a certain level of hierarchical syntactic information. More generally, our study shows that the priming paradigm is a useful, additional tool for gaining insights into the capacities of language models and opens the door to future priming-based investigations that probe the model's internal states.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#96 - Skryd 2024
ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study

Skryd, A.; Lawrence, K.

JMIR Form. Res. 2024;8():8

2024

DOI: 10.2196/51346 · Ref ID: 3700

Background: Large language models (LLMs) are computational artificial intelligence systems with advanced natural language processing capabilities that have recently been popularized among health care students and educators due to their ability to provide real-time access to a vast amount of medical knowledge. The adoption of LLM technology into medical education and training has varied, and little empirical evidence exists to support its use in clinical teaching environments. Objective: The aim of the study is to identify and qualitatively evaluate potential use cases and limitations of LLM technology for real-time ward-based educational contexts. Methods: A brief, single-site exploratory evaluation of the publicly available ChatGPT-3.5 (OpenAI) was conducted by implementing the tool into the daily attending rounds of a general internal medicine inpatient service at a large urban academic medical center. ChatGPT was integrated into rounds via both structured and organic use, using the web-based "chatbot" style interface to interact with the LLM through conversational free-text and discrete queries. A qualitative approach using phenomenological inquiry was used to identify key insights related to the use of ChatGPT through analysis of ChatGPT conversation logs and associated shorthand notes from the clinical sessions. Results: Identified use cases for ChatGPT integration included addressing medical knowledge gaps through discrete medical knowledge inquiries, building differential diagnoses and engaging dual-process thinking, challenging medical axioms, using cognitive aids to support acute care decision-making, and improving complex care management by facilitating conversations with subspecialties. Potential additional uses included engaging in difficult conversations with patients, exploring ethical challenges and general medical ethics teaching, personal continuing medical education resources, developing ward-based teaching tools, supporting and automating clinical documentation, and supporting productivity and task management. LLM biases, misinformation, ethics, and health equity were identified as areas of concern and potential limitations to clinical and training use. A code of conduct on ethical and appropriate use was also developed to guide team usage on the wards. Conclusions: Overall, ChatGPT offers a novel tool to enhance ward-based learning through rapid information querying, second-order content exploration, and engaged team discussion regarding generated responses. More research is needed to fully understand contexts for educational use, particularly regarding the risks and limitations of the tool in clinical settings and its impacts on trainee development.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2206 - Smith 2014
BPAL: A tool for managing semantically enriched conceptual process models

Smith, F.; Proietti, M.

eChallenges e-2014 Conference Proceedings 2014;():1-10

2014

Ref ID: 6557

In this paper we will provide an overview of the Business Process Abstract Language (BPAL) Platform, which implements a Business Process (BP) modelling and reasoning environment where the procedural knowledge of a BP can be enriched through ontology-based annotations. The BPAL Platform provides a graphical user interface to ease the definition of a Business Process Knowledge Base that collects the various facets of process knowledge. It also provides a reasoner implementing services for the enactment, verification, retrieval, and composition of processes in the knowledge base. After discussing the functionalities and the architecture of the tool, we report on an experimental evaluation of the whole system, whose results are encouraging and show the viability of the approach.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1699 - Snyder 2024
On Early Detection of Hallucinations in Factual Question Answering

Snyder, B.; Moisescu, M.; Zafar, M. B.

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():2721-2732

Association for Computing Machinery 2024

DOI: 10.1145/3637528.3671796 · Ref ID: 3933

While large language models (LLMs) have taken great strides towards helping humans with a plethora of tasks, hallucinations remain a major impediment towards gaining user trust. The fluency and coherence of model generations even when hallucinating makes detection a difficult task. In this work, we explore if the artifacts associated with the model generations can provide hints that the generation will contain hallucinations. Specifically, we probe LLMs at 1) the inputs via Integrated Gradients based token attribution, 2) the outputs via the Softmax probabilities, and 3) the internal state via self-attention and fully-connected layer activations for signs of hallucinations on open-ended question answering tasks. Our results show that the distributions of these artifacts tend to differ between hallucinated and non-hallucinated generations. Building on this insight, we train binary classifiers that use these artifacts as input features to classify model generations into hallucinations and non-hallucinations. These hallucination classifiers achieve up to 0.80 AUROC. We also show that tokens preceding a hallucination can already predict the subsequent hallucination even before it occurs. © 2024 Copyright held by the owner/author(s).
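The AUROC reported for these hallucination classifiers can be computed directly from classifier scores via the rank-sum (Mann-Whitney U) formulation; below is a minimal self-contained sketch with hypothetical labels and scores, not the paper's features or data:

```python
def auroc(labels, scores):
    """AUROC via the rank-sum formulation: the probability that a randomly
    chosen positive (hallucination) outscores a randomly chosen negative.

    labels: 1 = hallucination, 0 = non-hallucination
    scores: classifier scores, higher = more likely hallucination
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    # Count pairwise "wins"; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for six generations.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(auroc(labels, scores))  # → 0.888..., i.e. 8 of 9 pairs ranked correctly
```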

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#306 - Song 2023
Generative Event Extraction via Internal Knowledge-Enhanced Prompt Learning

Song, H. T.; Zhu, Q. M.; Yu, Z. P.; Liang, J.; He, H.

32nd International Conference on Artificial Neural Networks (ICANN) 2023;14258():90-102

Heraklion, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-44192-9_8 · Ref ID: 3671

Event extraction is a crucial research task in information extraction. To maximize the performance of pre-trained language models (PLMs), some works formulate event extraction as a conditional generation problem. However, most existing generative methods ignore the prior information between event entities and are usually over-dependent on hand-crafted templates, which introduces subjective intervention. In this paper, we propose a generative event extraction model named KEPGEE based on internal knowledge-enhanced prompt learning. We first use relational graph neural networks (RGCN) to encode the event triple entities and fuse them with the word embeddings to obtain the knowledge representation. The knowledge representation is then concatenated with task-specific virtual tokens to compose knowledge-enhanced soft prompts, which provide additional event information to adapt the sequence-to-sequence PLM to the generative event extraction task. Besides, in template design, we add related topic words into the prompt templates to enhance the implicit event information. We evaluate our model on the ACE2005 and ERE datasets, and the results show that it achieves matched or better performance than several classification-based or generation-based event extraction models (including the state-of-the-art models).

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1445 - Song 2024
ITAKE: Interactive Unstructured Text Annotation and Knowledge Extraction System with LLMs and ModelOps

Song, J.; Ding, H.; Wang, Z.; Xu, Y.; Zhao, J.; Wang, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;3():326-334

Association for Computational Linguistics (ACL) 2024

Ref ID: 4344

Extracting structured knowledge from unstructured text data has a wide range of application prospects, and a pervasive trend is to develop text annotation tools to aid extraction. However, such tools often encounter issues such as single-scenario usage, lack of effective human-machine collaboration, insufficient model supervision, and suboptimal utilization of Large Language Models (LLMs). We introduce an interactive unstructured text annotation and knowledge extraction system that synergistically integrates LLMs and ModelOps to alleviate these issues. The system leverages LLMs for enhanced performance in low-resource contexts, employs a ModelOps platform to monitor models throughout their lifecycle, and combines interactive annotation methods with online machine learning and active learning. The demo video and website are now publicly available. © 2024 Association for Computational Linguistics.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#1662 - Song 2023
Multilingual Knowledge Graph Completion from Pretrained Language Models with Knowledge Constraints

Song, R.; He, S.; Gao, S.; Cai, L.; Liu, K.; Yu, Z.; Zhao, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():7709-7721

Association for Computational Linguistics (ACL) 2023

Ref ID: 5154

Multilingual Knowledge Graph Completion (mKGC) aims at solving queries like (h, r, ?) in different languages by reasoning a tail entity t, thus improving multilingual knowledge graphs. Previous studies leverage multilingual pretrained language models (PLMs) and the generative paradigm to achieve mKGC. Although multilingual PLMs contain extensive knowledge of different languages, their pretraining tasks cannot be directly aligned with the mKGC task. Moreover, the majority of KGs and PLMs currently available exhibit a pronounced English-centric bias. This makes it difficult for mKGC to achieve good results, particularly in the context of low-resource languages. To overcome these problems, this paper introduces global and local knowledge constraints for mKGC. The former is used to constrain the reasoning of answer entities, while the latter is used to enhance the representation of query contexts. The proposed method makes the pretrained model better adapt to the mKGC task. Experimental results on public datasets demonstrate that our method outperforms the previous SOTA on Hits@1 and Hits@10 by an average of 12.32% and 16.03%, indicating that our proposed method significantly enhances mKGC. © 2023 Association for Computational Linguistics.
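The Hits@k metric reported here is simple to state: the fraction of queries whose correct tail entity is ranked within the top k predictions. A minimal sketch with hypothetical ranks (not the paper's data):

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose correct tail entity has 1-based rank <= k."""
    return sum(r <= k for r in ranks) / len(ranks)

# Hypothetical 1-based ranks of the correct entity for eight queries.
ranks = [1, 3, 1, 12, 2, 1, 8, 25]
print(hits_at_k(ranks, 1))   # → 0.375  (3 of 8 ranked first)
print(hits_at_k(ranks, 10))  # → 0.75   (6 of 8 within top 10)
```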

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#2392 - Song 2024
Enhancing Text-to-SQL Translation for Financial System Design

Song, Y.; Ezzini, S.; Tang, X.; Lothritz, C.; Klein, J.; Bissyandé, T.; Boytsov, A.; Ble, U.; Goujon, A.

2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) 2024;():252-262

2024

DOI: 10.1145/3639477.3639732 · Ref ID: 6581

Text-to-SQL, the task of translating natural language questions into SQL queries, is part of various business processes. Its automation, an emerging challenge, will empower software practitioners to seamlessly interact with relational databases using natural language, thereby bridging the gap between business needs and software capabilities. In this paper, we consider Large Language Models (LLMs), which have achieved state-of-the-art results on various NLP tasks. Specifically, we benchmark Text-to-SQL performance, the evaluation methodologies, as well as input optimization (e.g., prompting). In light of our empirical observations, we propose two novel metrics designed to adequately measure the similarity between SQL queries. Overall, we share with the community various findings, notably on how to select the right LLM for Text-to-SQL tasks. We further demonstrate that a tree-based edit distance constitutes a reliable metric for assessing the similarity between generated SQL queries and the oracle when benchmarking Text2SQL approaches. This metric is important as it relieves researchers from computationally expensive experiments such as executing generated queries, as done in prior work. Our work implements financial-domain use cases and therefore contributes to the advancement of Text2SQL systems and their practical adoption in this domain.
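The paper's metric operates on parsed SQL trees; as a rough illustration of the underlying idea, a token-level Levenshtein similarity (a deliberate simplification, not the authors' metric) compares a generated query against the oracle without executing either:

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences (single-row DP)."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            # deletion, insertion, substitution/match
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (a[i - 1] != b[j - 1]))
            prev = cur
    return dp[n]

def sql_similarity(q1, q2):
    """Normalized similarity in [0, 1] over lower-cased SQL tokens."""
    t1, t2 = q1.lower().split(), q2.lower().split()
    d = edit_distance(t1, t2)
    return 1 - d / max(len(t1), len(t2), 1)

gen = "SELECT name FROM accounts WHERE balance > 100"
ref = "SELECT name FROM accounts WHERE balance >= 100"
print(sql_similarity(gen, ref))  # → 0.875 (one substituted token out of eight)
```

A tree-based distance, as the paper argues, is more robust than this token view because it tolerates reorderings that leave the query structure unchanged.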

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1851 - Song 2024
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI

Song, Y.; Sun, P.; Liu, H.; Li, Z.; Song, W.; Xiao, Y.; Zhou, X.

IEEE Trans Knowl Data Eng 2024;36(11):6962-6976

2024

DOI: 10.1109/TKDE.2024.3399746 · Ref ID: 4623

Embodied AI is one of the most popular studies in artificial intelligence and robotics, as it can effectively improve the intelligence of real-world agents (i.e., robots) serving human beings. Scene knowledge is important for an agent to understand its surroundings and make correct decisions in the varied open world. Currently, a knowledge base for embodied tasks is missing, and most existing work uses general knowledge bases or pre-trained models to enhance the intelligence of an agent. Conventional knowledge bases are sparse, insufficient in capacity, and costly in data collection; pre-trained models face the uncertainty of knowledge and are hard to maintain. To overcome these challenges of scene knowledge, we propose a scene-driven multimodal knowledge graph (Scene-MMKG) construction method combining conventional knowledge engineering and large language models. A unified scene knowledge injection framework is introduced for knowledge representation. To evaluate the advantages of our proposed method, we instantiate Scene-MMKG for typical indoor robotic functionalities (manipulation and mobility), named ManipMob-MMKG. Comparisons of characteristics indicate that our instantiated ManipMob-MMKG has broad superiority in data-collection efficiency and knowledge quality. Experimental results on typical embodied tasks show that knowledge-enhanced methods using our instantiated ManipMob-MMKG can noticeably improve performance without complex re-design of model structures. © 2024 IEEE.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#619 - Sovrano 2023
An objective metric for Explainable AI: How and why to estimate the degree of explainability

Sovrano, F.; Vitali, F.

Knowledge-Based Syst. 2023;278():23

2023

DOI: 10.1016/j.knosys.2023.110866 · Ref ID: 3759

This paper presents a new method for objectively measuring the explainability of textual information, such as the outputs of Explainable AI (XAI). We introduce a metric called Degree of Explainability (DoX), drawing inspiration from Ordinary Language Philosophy and Achinstein's theory of explanations. It assumes that the degree of explainability is directly proportional to the number of relevant questions that a piece of information can correctly answer. We have operationalized this concept by formalizing the DoX metric through a mathematical formula, which we have integrated into a software tool named DoXpy. DoXpy relies on pre-trained deep language models for knowledge extraction and answer retrieval in order to estimate the DoX, transforming our theoretical insights into a practical tool for real-world applications. To confirm the effectiveness and consistency of our approach, we conducted comprehensive experiments and user studies with over 190 participants. These studies evaluated the quality of explanations by healthcare and finance XAI-based software systems. Our results demonstrate a correlation between increases in objective explanation usability and increments in the DoX score. These findings suggest that the DoX metric is congruent with other mainstream explainability measures. It provides a more objective and cost-effective alternative to non-deterministic user studies. Thus, we discuss the potential of DoX as a tool to evaluate the legal compliance of XAI systems. By bridging the gap between theory and practice in Explainable AI, our work fosters transparency, understandability, and legal compliance. DoXpy and related materials have been made available online to ensure reproducibility. © 2023 Elsevier B.V. All rights reserved.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1295 - Sreekantan 2022
Expert System for Question Answering on Anomalous Events and Mitigation Strategies Using Bidirectional Transformers and Knowledge Graphs

Sreekantan, J.; Hutchison, C.; Amatya, P.

Society of Petroleum Engineers - ADIPEC 2022 2022;():

Society of Petroleum Engineers 2022

DOI: 10.2118/211855-MS · Ref ID: 5422

Daily drilling reports provide vital information for well planning as they capture anomalous events and mitigation measures during drilling operations. Previous works predominantly focus on search frameworks for information retrieval from these reports. However, the context between searches is lost, preventing users from narrowing down to the exact answer. Here, we present a transformer-based closed-domain conversational agent for longer dialogues that guides users to contextual information about anomalous drilling events through natural language. Automated text extraction, cleaning, and validation tasks are initially performed to resolve data quality issues prior to language modeling on a validated data set. Subsequently, a knowledge graph is created by node embedding using entity extractions and by learning the semantic-level relationships between entity nodes such as well names and events. Further, conversational agents are trained on the knowledge graphs for natural dialogue generation using neural machine translation models. Here, users' questions are translated into a query in a structured language that is evaluated directly over the knowledge graph in order to generate the desired answers. The workflow was tested on an asset with multiple wells experiencing several anomalous events during drilling, such as stuck pipe, circulation losses, and kicks. The end-to-end workflow was tested on its ability to retrieve anomalous events and present mitigation measures in the aforementioned data set based on the descriptions input by survey participants. Anomaly extraction, attribute mapping, and mitigation performance were evaluated through F1 scores. A significantly high F1 score was recorded for anomaly extraction, driven predominantly by high precision due to explicit modeling of the reports as a knowledge graph. In addition to testing the workflow end to end, we tested the knowledge graph representation in isolation. For this, ranking metrics and triple classification with negative samples were used for the evaluation. The adjusted mean rank index was close to one, indicating high performance. Structured querying on the knowledge graphs also showed high accuracy for classifying anomalous events in the drilling report. The work described in this paper automates the end-to-end workflow for building an expert system that answers questions about anomalous events and mitigation strategies using daily drilling reports. Our novel approach using a knowledge graph with a transformer-based conversational agent enables users to perform detailed interactive investigation of anomalous events observed in daily drilling reports and to create mitigation strategies. The workflow also allows for incorporating prior domain knowledge from drilling experts. Copyright © 2022, Society of Petroleum Engineers.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#3549 - Stankevich 2024
Interpreting and learning voice commands with a Large Language Model for a robot system

Stankevich, Stanislau; Dudek, Wojciech

arXiv 2024;():

2024

Ref ID: 8498

Robots are increasingly common in industry and daily life, such as in nursing homes where they can assist staff. A key challenge is developing intuitive interfaces for easy communication. The use of Large Language Models (LLMs) like GPT-4 has enhanced robot capabilities, allowing for real-time interaction and decision-making. This integration improves robots' adaptability and functionality. This project focuses on merging LLMs with databases to improve decision-making and enable knowledge acquisition for request interpretation problems.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1241 - Stavropoulos 2023
Empowering Knowledge Discovery from Scientific Literature: A novel approach to Research Artifact Analysis

Stavropoulos, P.; Lyris, I.; Manola, N.; Grypari, I.; Papageorgiou, H.

3rd Workshop for Natural Language Processing Open Source Software, NLP-OSS 2023, Proceedings of the Workshop 2023;():37-53

Association for Computational Linguistics (ACL) 2023

DOI: 10.18653/v1/2023.nlposs-1.5 · Ref ID: 4916

Knowledge extraction from scientific literature is a major issue, crucial to promoting transparency, reproducibility, and innovation in the research community. In this work, we present a novel approach towards the identification, extraction and analysis of dataset and code/software mentions within scientific literature. We introduce a comprehensive dataset, synthetically generated by ChatGPT and meticulously curated, augmented, and expanded with real snippets of scientific text from full-text publications in Computer Science using a human-in-the-loop process. The dataset contains snippets highlighting mentions of the two research artifact (RA) types: dataset and code/software, along with insightful metadata including their Name, Version, License, URL as well as the intended Usage and Provenance. We also fine-tune a simple Large Language Model (LLM) using Low-Rank Adaptation (LoRA) to transform the Research Artifact Analysis (RAA) into an instruction-based Question Answering (QA) task. Ultimately, we report the improvements in performance on the test set of our dataset when compared to other base LLM models. Our method provides a significant step towards facilitating accurate, effective, and efficient extraction of datasets and software from scientific papers, contributing to the challenges of reproducibility and reusability in scientific research. © 2023 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#877 - Steenwinckel 2021
Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge Graphs

Steenwinckel, B.; Vandewiele, G.; Bonte, P.; Weyns, M.; Paulheim, H.; Ristoski, P.; De Turck, F.; Ongenae, F.

32nd International Conference on Database and Expert Systems Applications (DEXA) 2021;1479():70-80

Electr Network Springer International Publishing Ag 2021

DOI: 10.1007/978-3-030-87101-7_8 · Ref ID: 3019

As Knowledge Graphs (KGs) are symbolic constructs, specialized techniques have to be applied to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modeling techniques. The original work proposed the Weisfeiler-Lehman kernel to improve the quality of the representations. However, in this work, we show that the Weisfeiler-Lehman kernel does little to improve walk embeddings in the context of a single Knowledge Graph. As an alternative, we examined five strategies to extract information complementary to basic random walks and compared them on several benchmark datasets, showing that research in this field is still relevant for node classification tasks.
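The basic random-walk extraction that RDF2Vec-style methods build on can be sketched in a few lines; the toy graph and its triples below are hypothetical, and the resulting walks would normally be fed to a word2vec-style model:

```python
import random

def extract_walks(graph, depth, walks_per_node, seed=0):
    """Basic random-walk extraction over a KG, RDF2Vec-style.

    graph: dict mapping an entity to a list of (predicate, object) edges.
    Each walk starts at every node and alternates entities and predicates,
    so the walks read like sentences over the graph's vocabulary.
    """
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(depth):
                edges = graph.get(node)
                if not edges:  # dead end: stop this walk early
                    break
                pred, obj = rng.choice(edges)
                walk.extend([pred, obj])
                node = obj
            walks.append(walk)
    return walks

# Toy KG with hypothetical triples.
kg = {
    "Brussels": [("capitalOf", "Belgium")],
    "Belgium": [("memberOf", "EU")],
    "EU": [],
}
for w in extract_walks(kg, depth=2, walks_per_node=1):
    print(" -> ".join(w))
```

The five strategies the paper examines replace or bias this uniform `rng.choice` step; the walk corpus itself is then embedded exactly as sentences would be.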

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2173 - Steinegger 2016
Automatic generation of diagnostic handling code for decentralized PLC-based control architectures

Steinegger, M.; Melik-Merkumians, M.; Zajc, J.; Schitter, G.

2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA) 2016;():1-8

2016

DOI: 10.1109/ETFA.2016.7733694 · Ref ID: 7067

In this paper, an ontology-based approach to automatically generate control applications to handle diagnostic information of decentralized control devices is presented. Diagnostic possibilities of modern remote I/O devices are analyzed and software components in terms of function blocks to handle the specific diagnostic information are defined. After a detailed conceptual overview, the application of the proposed knowledge-based code generation approach to a PLC-based control architecture of a hot rolling mill is described. It is shown that the proposed approach significantly reduces engineering time and the error rate in the design processes of industrial control and diagnostic applications, since the application engineering is raised to an abstract level by utilizing pre-defined, tested, and reusable function blocks and a user-definable set of code generation rules to encode repetitive implementation tasks. The rules are defined in the query language SPARQL with additional ARQ functions to reduce the complexity.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3418 - Steinigen 2024
Fact Finder – Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs

Steinigen, Daniel; Teucher, Roman; Ruland, Timm Heine; Rudat, Max; Flores-Herr, Nicolas; Fischer, Peter; Milosevic, Nikola; Schymura, Christopher; Ziletti, Angelo

arXiv 2024;():

2024

Ref ID: 8511

Recent advancements in Large Language Models (LLMs) have showcased their proficiency in answering natural language queries. However, their effectiveness is hindered by limited domain-specific knowledge, raising concerns about the reliability of their responses. We introduce a hybrid system that augments LLMs with domain-specific knowledge graphs (KGs), thereby aiming to enhance factual correctness using a KG-based retrieval approach. We focus on a medical KG to demonstrate our methodology, which includes (1) pre-processing, (2) Cypher query generation, (3) Cypher query processing, (4) KG retrieval, and (5) LLM-enhanced response generation. We evaluate our system on a curated dataset of 69 samples, achieving a precision of 78% in retrieving correct KG nodes. Our findings indicate that the hybrid system surpasses a standalone LLM in accuracy and completeness, as verified by an LLM-as-a-Judge evaluation method. This positions the system as a promising tool for applications that demand factual correctness and completeness, such as target identification – a critical process in pinpointing biological entities for disease treatment or crop enhancement. Moreover, its intuitive search interface and ability to provide accurate responses within seconds make it well-suited for time-sensitive, precision-focused research contexts. We publish the source code together with the dataset and the prompt templates used.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#2594 - Steinmetz 2018
Internet of Things Ontology for Digital Twin in Cyber Physical Systems

Steinmetz, C.; Rettberg, A.; Ribeiro, F. G. C.; Schroeder, G.; Pereira, C. E.

2018 VIII Brazilian Symposium on Computing Systems Engineering (SBESC) 2018;():154-159

2018

DOI: 10.1109/SBESC.2018.00030 · Ref ID: 6757

The Digital Twin is one of the most important concepts of the Cyber Physical Systems (CPS) era. It can bring benefits such as simulation, monitoring, or management, since it joins the physical and the virtual through the Internet of Things. This concept is being adopted more and more in academia and industry, but there is still a lack of methods to define and formalize the representation of the Digital Twin, for example through semantic models. Ontologies are a way of representing knowledge that can be shared between different entities, allowing a common understanding of information. In this sense, this work proposes an ontology to represent the Digital Twin in the context of CPS and embedded systems. These concepts are implemented through a proposed architecture. The proposed ideas are being evaluated in industrial case studies, and some preliminary results are described in the paper.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3078 - Štolc 2010
A visual based framework for the model refactoring techniques

Štolc, M.; Polášek, I.

2010 IEEE 8th International Symposium on Applied Machine Intelligence and Informatics (SAMI) 2010;():72-82

2010

DOI: 10.1109/SAMI.2010.5423766 · Ref ID: 6520

Refactoring is one of the most important rules and practices of Extreme Programming, from the family of Agile Methodologies. We propose a tool to refactor UML models (Class Diagrams, for now). In the first step we find the flaws (bad smells) in the model with an OCL query, and in the second step we transform each flaw into a correct fragment with a transformation script. The paper presents a set of methods and tools for model adjustment, cooperating with CASE systems. We analyze the concepts and algorithms for refactoring, OCL queries, and transformation-script generation. We have prepared a functional prototype of the editor for refactoring-rule definition, the OCL query generator, and the transformation script generator. In the future, we plan to extend the framework with alternative notations (e.g., QVT graph transformation rules, PICS, Viatra2) and other techniques to find the flaws (e.g., a rule-based system with predicates for the bad smells, XMI transformations and Abstract Syntax Tree algebra, Bit-Vector and Similarity Scoring Algorithms).

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3034 - Stramandinoli 2011
Towards the grounding of abstract words: A Neural Network model for cognitive robots

Stramandinoli, F.; Cangelosi, A.; Marocco, D.

The 2011 International Joint Conference on Neural Networks 2011;():467-474

2011

DOI: 10.1109/IJCNN.2011.6033258 · Ref ID: 6277

In this paper, a model based on Artificial Neural Networks (ANNs) extends the symbol grounding mechanism to abstract words for cognitive robots. The aim of this work is to obtain a semantic representation of abstract concepts through the grounding in sensorimotor experiences for a humanoid robotic platform. Simulation experiments have been developed on a software environment for the iCub robot. Words that express general actions with a sensorimotor component are first taught to the simulated robot. During the training stage the robot first learns to perform a set of basic action primitives through the mechanism of direct grounding. Subsequently, the grounding of action primitives, acquired via direct sensorimotor experience, is transferred to higher-order words via linguistic descriptions. The idea is that by combining words grounded in sensorimotor experience the simulated robot can acquire more abstract concepts. The experiments aim to teach the robot the meaning of abstract words by making it experience sensorimotor actions. The iCub humanoid robot will be used for testing experiments on a real robotic architecture.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2233 - Su 2009
A Chinese Document Retrieval Method Enhanced by Concept Base

Su, J.; Weng, W.; Wang, Z.

2009 WRI World Congress on Computer Science and Information Engineering 2009;5():200-203

2009

DOI: 10.1109/CSIE.2009.496 · Ref ID: 6440

Full-text searching techniques have been extensively used in the area of information retrieval. However, the full-text searching techniques are often insufficient to retrieve meaningful or valuable documents since the basic idea of these techniques is word or phrase matching, not concept matching. A Chinese document retrieval method enhanced by concept base is proposed in this paper. The main idea of this method is to build a common Chinese concept base to provide a shared understanding of concepts. This enhanced method can take advantage of the concept base when analyzing and indexing documents, and when searching documents. The document management system can use this method to improve the retrieval performance.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2018 - Su 2024
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models

Su, W.; Wang, C.; Ai, Q.; Hu, Y.; Wu, Z.; Zhou, Y.; Liu, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14379-14391

Association for Computational Linguistics (ACL) 2024

Ref ID: 4404

Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating hallucinations of LLMs. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, which tend to be computationally intensive and limited in effectiveness due to their separation from the LLM's inference process. To overcome these limitations, we introduce MIND, an unsupervised training framework that leverages the internal states of LLMs for real-time hallucination detection without requiring manual annotations. Additionally, we present HELM, a new benchmark for evaluating hallucination detection across multiple LLMs, featuring diverse LLM outputs and the internal states of LLMs during their inference process. Our experiments demonstrate that MIND outperforms existing state-of-the-art methods in hallucination detection. © 2024 Association for Computational Linguistics.
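The core idea of using internal states for real-time detection can be sketched as a small probe over hidden-state vectors, scored during inference. The vectors and labels below are invented toy data, and MIND's actual training is unsupervised over auto-constructed text; this sketch only shows the probe-and-threshold mechanic, not the paper's method.

```python
import math

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_probe(examples, lr=0.5, epochs=200):
    """Fit a tiny logistic probe on (hidden_state_vector, label) pairs,
    label 1 = hallucinated continuation, 0 = faithful one."""
    w = [0.0] * len(examples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = 1 / (1 + math.exp(-(dot(w, x) + b)))
            g = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def is_hallucination(w, b, x, threshold=0.5):
    """Score a new hidden-state vector at generation time."""
    return 1 / (1 + math.exp(-(dot(w, x) + b))) > threshold

# Toy activations: first dimension high for hallucinated examples.
data = [([1.0, 0.1], 1), ([0.9, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 1.0], 0)]
w, b = train_probe(data)
print(is_hallucination(w, b, [0.95, 0.1]))
```

Because the probe runs on activations the model computes anyway, detection adds negligible latency compared with post-processing approaches.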

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1637 - Su 2023
MeKB-Rec: Personal Knowledge Graph Learning for Cross-Domain Recommendation

Su, X.; Zhou, Y.; Shan, Z.; Chen, Q.

CEUR Workshop Proceedings 2023;3560():90-102

CEUR-WS 2023

Ref ID: 5028

It is a long-standing challenge in modern recommender systems to make recommendations for new users, namely the cold-start problem. Cross-Domain Recommendation (CDR) has been proposed to address this challenge, but current ways to represent users' interests across systems are still severely limited. We introduce the Personal Knowledge Graph (PKG) as a domain-invariant interest representation, and propose a novel CDR paradigm named MeKB-Rec. We first link users and entities in a knowledge base to construct a PKG of users' interests, named MeKB. Then we learn a semantic representation of MeKB for cross-domain recommendation. Beyond most existing systems, our approach builds a semantic mapping across domains using Pretrained Language Models, which removes the requirement for in-domain user behaviors and enables zero-shot recommendations for new users in a low-resource domain. We evaluate MeKB-Rec on well-established public CDR datasets, and demonstrate that the new formulation achieves a new state-of-the-art that significantly improves HR@10 and NDCG@10 metrics over the best previous approaches by 24%–91%, with a 105% improvement in HR@10 for zero-shot users with no behavior in the target domain. We deploy MeKB-Rec in WeiXin recommendation scenarios and achieve significant gains in core online metrics. MeKB-Rec is now serving hundreds of millions of users in real-world products. © 2023 Copyright for this paper by its authors.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1253 - Su 2024
Enhancing Exploratory Testing by Large Language Model and Knowledge Graph

Su, Y.; Liao, D.; Xing, Z.; Huang, Q.; Xie, M.; Lu, Q.; Xu, X.

Proceedings - International Conference on Software Engineering 2024;():1197-1208

IEEE Computer Society 2024

DOI: 10.1145/3597503.3639157 · Ref ID: 4640

Exploratory testing leverages the tester's knowledge and creativity to design test cases for effectively uncovering system-level bugs from the end user's perspective. Researchers have worked on test scenario generation to support exploratory testing based on a system knowledge graph, enriched with scenario and oracle knowledge from bug reports. Nevertheless, the adoption of this approach is hindered by difficulties in handling bug reports of inconsistent quality and varied expression styles, along with the infeasibility of the generated test scenarios. To overcome these limitations, we utilize the superior natural language understanding (NLU) capabilities of Large Language Models (LLMs) to construct a System KG of User Tasks and Failures (SysKG-UTF). Leveraging the system and bug knowledge from the KG, along with the logical reasoning capabilities of LLMs, we generate test scenarios with high feasibility and coherence. Particularly, we design chain-of-thought (CoT) reasoning to extract human-like knowledge and logical reasoning from LLMs, simulating a developer's process of validating test scenario feasibility. Our evaluation shows that our approach significantly enhances the KG construction, particularly for bug reports with low quality. Furthermore, our approach generates test scenarios with high feasibility and coherence. The user study further proves the effectiveness of our generated test scenarios in supporting exploratory testing. Specifically, 8 participants find 36 bugs from 8 seed bugs in two hours using our test scenarios, a significant improvement over the 21 bugs found by the state-of-the-art baseline. © 2024 ACM.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1622 - Subramanian 2024
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering

Subramanian, A.; Schlegel, V.; Kashyap, A. R.; Nguyen, T. T.; Dwivedi, V. P.; Winkler, S.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():4002-4042

Association for Computational Linguistics (ACL) 2024

Ref ID: 4403

There is active research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain, a fundamental prerequisite for success on downstream tasks. Addressing this gap, we use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains. Our multifaceted analysis of the performance of 15 LLMs, further broken down by sub-domain, source of knowledge, and model architecture, uncovers success factors such as instruction tuning that lead to improved recall and comprehension. We further show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results, even generalising to unseen specialist sub-domains. We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented context. To foster research and collaboration in this field, we share M-QALM (our resources, standardised methodology, and evaluation results) with the research community to facilitate further advancements in clinical knowledge representation learning within language models. © 2024 Association for Computational Linguistics.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#1492 - Suchanek 2023
Knowledge Bases and Language Models: Complementing Forces

Suchanek, F.; Luu, A. T.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14244 LNCS():3-15

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-45072-3_1 · Ref ID: 5185

Large language models (LLMs), as a particular instance of generative artificial intelligence, have revolutionized natural language processing. In this invited paper, we argue that LLMs are complementary to structured data repositories such as databases or knowledge bases, which use symbolic knowledge representations. Hence, the two ways of knowledge representation will likely continue to co-exist, at least in the near future. We discuss ways that have been explored to make the two approaches work together, and point out opportunities and challenges for their symbiosis. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3559 - Sukhwal 2024
A Joint-Reasoning based Disease Q&A System

Sukhwal, Prakash Chandra; Rajan, Vaibhav; Kankanhalli, Atreyi

arXiv 2024;():

2024

Ref ID: 8027

Medical question answer (QA) assistants respond to lay users' health-related queries by synthesizing information from multiple sources using natural language processing and related techniques. They can serve as vital tools to alleviate issues of misinformation, information overload, and complexity of medical language, thus addressing lay users' information needs while reducing the burden on healthcare professionals. QA systems, the engines of such assistants, have typically used either language models (LMs) or knowledge graphs (KG), though the approaches could be complementary. LM-based QA systems excel at understanding complex questions and providing well-formed answers, but are prone to factual mistakes. KG-based QA systems, which represent facts well, are mostly limited to answering short-answer questions with pre-created templates. While a few studies have jointly used LM and KG approaches for text-based QA, this was done to answer multiple-choice questions. Extant QA systems also have limitations in terms of automation and performance. We address these challenges by designing a novel, automated disease QA system which effectively utilizes both LM and KG techniques through a joint-reasoning approach to answer disease-related questions appropriate for lay users. Our evaluation of the system using a range of quality metrics demonstrates its efficacy over benchmark systems, including the popular ChatGPT.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1001 - Sumanathilaka 2024
Assessing GPT's Potential for Word Sense Disambiguation: A Quantitative Evaluation on Prompt Engineering Techniques

Sumanathilaka, D.; Micallef, N.; Hough, J.

2024 IEEE 15th Control and System Graduate Research Colloquium, ICSGRC 2024 - Conference Proceeding 2024;():204-209

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICSGRC62081.2024.10691283 · Ref ID: 4163

Modern digital communications (including social media content) often contain ambiguous words due to their potential for multiple related interpretations (polysemy). This ambiguity poses challenges for traditional Word Sense Disambiguation (WSD) methods, which struggle with limited data and lack of contextual understanding. These limitations hinder efficient translation, information retrieval, and question-answering systems, thereby restricting the benefits of computational linguistics techniques when applied to digital communication technologies. Our research investigates the use of Large Language Models (LLMs) to improve WSD using various prompt engineering techniques. We propose and evaluate a novel method that combines a knowledge graph with Part-of-Speech (POS) tagging and few-shot prompting to guide LLMs. By utilizing prompt augmentation with human-in-the-loop few-shot prompting, this work demonstrates a substantial improvement in WSD. This research advances accurate word interpretation in digital communications, leading to important implications for improved translation systems, better search results, and more intelligent question-answering technology. © 2024 IEEE.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3204 - Sumpter 2024
Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models

Sumpter, Scott

arXiv 2024;():

2024

Ref ID: 8265

This study introduces a transformative framework for medical education by integrating semi-structured data with Large Language Models (LLMs), primarily OpenAI's ChatGPT 3.5, to automate the creation of medical simulation scenarios. Traditionally, developing these scenarios was a time-intensive process with limited flexibility to meet diverse educational needs. The proposed approach utilizes AI to efficiently generate detailed, clinically relevant scenarios that are tailored to specific educational objectives. This innovation has significantly reduced the time and resources required for scenario development, allowing for a broader variety of simulations. Preliminary feedback from educators and learners has shown enhanced engagement and improved knowledge acquisition, confirming the effectiveness of this AI-enhanced methodology in simulation-based learning. The integration of structured data with LLMs not only streamlines the creation process but also offers a scalable, dynamic solution that could revolutionize medical training, highlighting the critical role of AI in advancing educational outcomes and patient care standards.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#161 - Sun 2021
Deep learning with language models improves named entity recognition for PharmaCoNER

Sun, C.; Yang, Z. H.; Wang, L.; Zhang, Y.; Lin, H. F.; Wang, J.

BMC Bioinformatics 2021;22(SUPPL 1):16

2021

DOI: 10.1186/s12859-021-04260-y · Ref ID: 3281

Background The recognition of pharmacological substances, compounds and proteins is essential for biomedical relation extraction, knowledge graph construction, drug discovery, as well as medical question answering. Although considerable efforts have been made to recognize biomedical entities in English texts, to date, only a few limited attempts have been made to recognize them in biomedical texts in other languages. PharmaCoNER is a named entity recognition challenge to recognize pharmacological entities from Spanish texts. Because there are currently abundant resources in the field of natural language processing, how to leverage these resources for the PharmaCoNER challenge is a meaningful question. Methods Inspired by the success of deep learning with language models, we compare and explore various representative BERT models to promote the development of the PharmaCoNER task. Results The experimental results show that deep learning with language models can effectively improve model performance on the PharmaCoNER dataset. Our method achieves state-of-the-art performance on the PharmaCoNER dataset, with a max F1-score of 92.01%. Conclusion For the BERT models on the PharmaCoNER dataset, biomedical domain knowledge has a greater impact on model performance than the native language (i.e., Spanish). The BERT models can obtain competitive performance by using WordPiece to alleviate the out-of-vocabulary limitation. The performance of the BERT model can be further improved by constructing a specific vocabulary based on domain knowledge. Moreover, the character case also has a certain impact on model performance.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3761 - Sun 2024
Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement

Sun, Chenkai; Yang, Ke; Reddy, Revanth Gangi; Fung, Yi R.; Chan, Hou Pong; Small, Kevin; Zhai, ChengXiang; Ji, Heng

arXiv 2024;():

2024

Ref ID: 8109

The increasing demand for personalized interactions with large language models (LLMs) calls for methodologies capable of accurately and efficiently identifying user opinions and preferences. Retrieval augmentation emerges as an effective strategy, as it can accommodate a vast number of users without the costs from fine-tuning. Existing research, however, has largely focused on enhancing the retrieval stage and devoted limited exploration toward optimizing the representation of the database, a crucial aspect for tasks such as personalization. In this work, we examine the problem from a novel angle, focusing on how data can be better represented for more data-efficient retrieval in the context of LLM customization. To tackle this challenge, we introduce Persona-DB, a simple yet effective framework consisting of a hierarchical construction process to improve generalization across task contexts and collaborative refinement to effectively bridge knowledge gaps among users. In the evaluation of response prediction, Persona-DB demonstrates superior context efficiency in maintaining accuracy with a significantly reduced retrieval size, a critical advantage in scenarios with extensive histories or limited context windows. Our experiments also indicate a marked improvement of over 10% under cold-start scenarios, when users have extremely sparse data. Furthermore, our analysis reveals the increasing importance of collaborative knowledge as the retrieval capacity expands.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3594 - Sun 2024
Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback

Sun, Jingwei; Du, Zhixu; Chen, Yiran

arXiv 2024;():

2024

Ref ID: 8330

Large language models (LLMs) have demonstrated remarkable proficiency in a range of natural language processing tasks. Once deployed, LLMs encounter users with personalized factual knowledge, and such personalized knowledge is consistently reflected through users' interactions with the LLMs. To enhance user experience, real-time model personalization is essential, allowing LLMs to adapt user-specific knowledge based on user feedback during human-LLM interactions. Existing methods mostly require back-propagation to finetune the model parameters, which incurs high computational and memory costs. In addition, these methods suffer from low interpretability, which will cause unforeseen impacts on model performance during long-term use, where the user's personalized knowledge is accumulated extensively. To address these challenges, we propose Knowledge Graph Tuning (KGT), a novel approach that leverages knowledge graphs (KGs) to personalize LLMs. KGT extracts personalized factual knowledge triples from users' queries and feedback and optimizes KGs without modifying the LLM parameters. Our method improves computational and memory efficiency by avoiding back-propagation and ensures interpretability by making the KG adjustments comprehensible to humans. Experiments with state-of-the-art LLMs, including GPT-2, Llama2, and Llama3, show that KGT significantly improves personalization performance while reducing latency and GPU memory costs. Ultimately, KGT offers a promising solution of effective, efficient, and interpretable real-time LLM personalization during user interactions with the LLMs.
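The KGT idea of personalizing via graph edits rather than weight updates can be sketched minimally. The triple-extraction step is stubbed with a fixed sentence pattern and the feedback strings are invented for the example; the paper uses the LLM itself to extract (subject, relation, object) triples from queries and feedback.

```python
def extract_triple(feedback: str):
    """Hypothetical extraction for sentences shaped like
    "X's favorite Y is Z" (a stand-in for LLM-based extraction)."""
    subj, rest = feedback.split("'s favorite ", 1)
    rel, obj = rest.split(" is ", 1)
    return (subj.strip(), f"favorite_{rel.strip()}", obj.strip().rstrip("."))

class PersonalKG:
    """User-specific knowledge graph; updates never touch model weights."""
    def __init__(self):
        self.triples = {}

    def update(self, triple):
        s, r, o = triple
        self.triples[(s, r)] = o  # newer feedback overwrites older facts

    def lookup(self, subj, rel):
        return self.triples.get((subj, rel))

kg = PersonalKG()
kg.update(extract_triple("Alice's favorite drink is tea."))
kg.update(extract_triple("Alice's favorite drink is coffee."))  # correction
print(kg.lookup("Alice", "favorite_drink"))
```

Because every personalization step is a readable triple insertion or overwrite, the accumulated adjustments stay inspectable, which is the interpretability argument the abstract makes against gradient-based fine-tuning.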

Kwesi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#32 - Sun 2022
Assessing Scientific Research Papers with Knowledge Graphs

Sun, K. X.; Qiu, Z. Q.; Salinas, A.; Huang, Y. Z.; Lee, D. H.; Benjamin, D.; Morstatter, F.; Ren, X.; Lerman, K.; Pujara, J.; Acm

45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2022;():2467-2472

Madrid, SPAIN Assoc Computing Machinery 2022

DOI: 10.1145/3477495.3531879 · Ref ID: 3066

In recent decades, the growing scale of scientific research has led to numerous novel findings. Reproducing these findings is the foundation of future research. However, due to the complexity of experiments, manually assessing scientific research is laborious and time-intensive, especially in social and behavioral sciences. Although increasing reproducibility studies have garnered increased attention in the research community, there is still a lack of systematic ways for evaluating scientific research at scale. In this paper, we propose a novel approach towards automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during the KG construction, we combine information from two different perspectives: micro-level features that capture knowledge from published articles such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities such as authorship and reference information. We then learn low-dimensional representations using language models and knowledge graph embeddings for entities (nodes in KGs), which are further used for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1372 - Sun 2024
Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?

Sun, K.; Xu, Y. E.; Zha, H.; Liu, Y.; Dong, X. L.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():311-325

Association for Computational Linguistics (ACL) 2024

Ref ID: 4459

Since the recent prosperity of Large Language Models (LLMs), there have been interleaved discussions regarding how to reduce hallucinations from LLM responses, how to increase the factuality of LLMs, and whether Knowledge Graphs (KGs), which store the world knowledge in a symbolic form, will be replaced with LLMs. In this paper, we try to answer these questions from a new angle: How knowledgeable are LLMs? To answer this question, we constructed Head-to-Tail, a benchmark that consists of 18K question-answer (QA) pairs regarding head, torso, and tail facts in terms of popularity. We designed an automated evaluation method and a set of metrics that closely approximate the knowledge an LLM confidently internalizes. Through a comprehensive evaluation of 16 publicly available LLMs, we show that existing LLMs are still far from being perfect in terms of their grasp of factual knowledge, especially for facts of torso-to-tail entities. ©2024 Association for Computational Linguistics.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#260 - Sun 2024
Exploring sequence-to-sequence taxonomy expansion via language model probing

Sun, K.; Yu, J. F.; Li, J. Z.; Hou, L.

Expert Syst. Appl. 2024;239():8

2024

DOI: 10.1016/j.eswa.2023.122321 · Ref ID: 3370

Taxonomy is a knowledge graph of concept hierarchy which plays a significant role in semantic entailment and is widely used in many downstream natural language processing tasks. Distinct from building a taxonomy from scratch, the task of taxonomy expansion aims at enriching an existing taxonomy by adding new concepts. However, existing methods often construct only part of the semantic relationships for representing the taxonomy, which may ignore useful features. Meanwhile, as many recent models treat this task in an insertion-only manner, they remain limited when the new concept is not a simple insertion into the taxonomy. Therefore, we propose TaxoSeq, a method that converts the task of taxonomy expansion into a sequence-to-sequence setting, thereby effectively exploiting the entire structural features and naturally dealing with more expansion cases. Empowered by pre-trained language models such as T5, our approach is shown to achieve significant progress over other methods on three public SemEval benchmark datasets.
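The sequence-to-sequence framing rests on linearizing the concept hierarchy into a string a model like T5 can consume, with the target string containing the new concept in place. The bracketed linearization scheme and the toy taxonomy below are invented for illustration; the paper's actual serialization may differ.

```python
def linearize(node, tree):
    """Depth-first linearization of a taxonomy into a bracketed sequence."""
    children = tree.get(node, [])
    if not children:
        return node
    inner = " ".join(linearize(c, tree) for c in children)
    return f"{node} ( {inner} )"

taxonomy = {"science": ["physics", "biology"], "biology": ["genetics"]}

# Source sequence: the existing taxonomy.
source = linearize("science", taxonomy)
print(source)

# Target sequence: the taxonomy after attaching "optics" under "physics".
taxonomy["physics"] = ["optics"]
print(linearize("science", taxonomy))
```

A seq2seq model trained on such (source, target) pairs can in principle emit any restructuring of the hierarchy, which is how the formulation escapes the insertion-only limitation the abstract criticizes.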

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1305 - Sun 2024
Exploring sequence-to-sequence taxonomy expansion via language model probing

Sun, K.; Yu, J.; Li, J.; Hou, L.

Expert Sys Appl 2024;239():

2024

DOI: 10.1016/j.eswa.2023.122321 · Ref ID: 4047

Taxonomy is a knowledge graph of concept hierarchy which plays a significant role in semantic entailment and is widely used in many downstream natural language processing tasks. Distinct from building a taxonomy from scratch, the task of taxonomy expansion aims at enriching an existing taxonomy by adding new concepts. However, existing methods often construct only part of the semantic relationships for representing the taxonomy, which may ignore useful features. Meanwhile, as many recent models treat this task in an insertion-only manner, they remain limited when the new concept is not a simple insertion into the taxonomy. Therefore, we propose TaxoSeq, a method that converts the task of taxonomy expansion into a sequence-to-sequence setting, thereby effectively exploiting the entire structural features and naturally dealing with more expansion cases. Empowered by pre-trained language models such as T5, our approach is shown to achieve significant progress over other methods on three public SemEval benchmark datasets. © 2023 Elsevier Ltd

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#1695 - Sun 2024
ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs

Sun, L.; Tao, Z.; Li, Y.; Arakawa, H.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():7417-7431

Association for Computational Linguistics (ACL) 2024

Ref ID: 4165

The integration of Large Language Models (LLMs) and knowledge graphs (KGs) has achieved remarkable success in various natural language processing tasks. However, existing methodologies that integrate LLMs and KGs often navigate the task-solving process solely based on the LLM's analysis of the question, overlooking the rich cognitive potential inherent in the vast knowledge encapsulated in KGs. To address this, we introduce Observation-Driven Agent (ODA), a novel AI agent framework tailored for tasks involving KGs. ODA incorporates KG reasoning abilities via global observation, which enhances reasoning capabilities through a cyclical paradigm of observation, action, and reflection. Confronting the exponential explosion of knowledge during observation, we innovatively design a recursive observation mechanism. Subsequently, we integrate the observed knowledge into the action and reflection modules. Through extensive experiments, ODA demonstrates state-of-the-art performance on several datasets, notably achieving accuracy improvements of 12.87% and 8.9%. Our code and data are available on https://github.com/lanjiuqing64/KGdata. © 2024 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3662 - Sun 2024
LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing

Sun, Maojun

arXiv 2024;():

2024

Ref ID: 8351

Large language models (LLMs) have shown amazing capabilities in knowledge memorization and presentation. However, when it comes to domain-specific knowledge and downstream tasks such as medicine, general LLMs are often unable to give precise answers. In addition, when people want LLMs to answer classification questions, they usually go through instruction tuning first. However, LLMs do not always give a direct index of the categorization after instruction tuning. In this paper, we propose LlamaCare, a fine-tuned medical language model, and Extended Classification Integration (ECI), a module to handle classification problems of LLMs. Our contributions are: (i) We fine-tuned a large language model on medical knowledge with very low carbon emissions and achieved performance similar to ChatGPT using a 24 GB GPU. (ii) We solved the problem of redundant categorical answers and improved the performance of LLMs by proposing a new module called Extended Classification Integration. (iii) We released our processed data for one-shot and few-shot training for benchmarks such as PubMedQA and USMLE steps 1-3. Our method achieves performance close to some state-of-the-art models with the same number of parameters on benchmarks, while being more environmentally friendly by using less GPU computation time. Our models, code, and datasets can be found at https://github.com/Stephen-SMJ/LLamaCare.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1121 - Sun 2024
Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction

Sun, Q.; Huang, K.; Yang, X.; Tong, R.; Zhang, K.; Poria, S.

WWW 2024 - Proceedings of the ACM Web Conference 2024;():4407-4416

Association for Computing Machinery, Inc 2024

DOI: 10.1145/3589334.3645678 · Ref ID: 4030

Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document. Existing methods heavily rely on a substantial amount of fully labeled data. However, collecting and annotating data for newly emerging relations is time-consuming and labor-intensive. Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities, inspiring us to explore an alternative approach for obtaining auto-labeled documents with new relations. In this paper, we propose a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework, which Generates labeled data by Retrieval and Denoising Knowledge from LLMs, called GenRDK. Specifically, we propose a chain-of-retrieval prompt to guide ChatGPT to generate labeled long-text data step by step. To improve the quality of synthetic data, we propose a denoising strategy based on the consistency of cross-document knowledge. Leveraging our denoised synthetic data, we proceed to fine-tune the LLaMA2-13B-Chat for extracting document-level relation triplets. We perform experiments for both zero-shot document-level relation and triplet extraction on two public datasets. The experimental results illustrate that our GenRDK framework outperforms strong baselines. © 2024 ACM.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1841 - Sun 2024
Root Cause Analysis for Industrial Process Anomalies through the Integration of Knowledge Graph and Large Language Model

Sun, Q.; Li, Y.; Zhou, C.; Tian, Y. C.

Chinese Control Conference, CCC 2024;():6855-6860

IEEE Computer Society 2024

DOI: 10.23919/CCC63176.2024.10662704 · Ref ID: 4155

Root cause analysis for industrial process anomalies is critical for manufacturing activities. Industrial process alarms can provide crucial information to enable root cause analysis. However, the complex system structure causes a large number of alarms to emerge at the same time. To address this issue, we propose an approach that utilizes knowledge graphs and large language models to provide comprehensible root cause analysis. First, we extract knowledge such as historical anomalies from catalytic cracking operation manuals to construct an industrial process safety knowledge graph. Then, named entities in each alarm are extracted as keywords to retrieve factual knowledge from the knowledge graph. Finally, the factual knowledge is provided to the large language model as prior knowledge to infer the root cause of anomalies. Experimental results show that the proposed approach can accurately identify the root cause, thereby ensuring the safety of industrial processes. © 2024 Technical Committee on Control Theory, Chinese Association of Automation.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1091 - Sun 2020
CoLAKE: Contextualized Language and Knowledge Embedding

Sun, T.; Shao, Y.; Qiu, X.; Guo, Q.; Hu, Y.; Huang, X.; Zhang, Z.

COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference 2020;():3660-3670

Association for Computational Linguistics (ACL) 2020

Ref ID: 5747

With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models. Few works explore the potential of deep contextualized knowledge representation when injecting knowledge. In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation for both language and knowledge with the extended MLM objective. Instead of injecting only entity embeddings, CoLAKE extracts the knowledge context of an entity from large-scale knowledge bases. To handle the heterogeneity of knowledge context and language context, we integrate them in a unified data structure, word-knowledge graph (WK graph). CoLAKE is pre-trained on large-scale WK graphs with the modified Transformer encoder. We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks. Experimental results show that CoLAKE outperforms previous counterparts on most of the tasks. Besides, CoLAKE achieves surprisingly high performance on our synthetic task called word-knowledge graph completion, which shows the superiority of simultaneously contextualizing language and knowledge representation. © 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1100 - Sun 2023
Combining Structure Embedding and Text Semantics for Efficient Knowledge Graph Completion

Sun, W.; Li, Y.; Yao, J.; Wu, Q.; Liu, K.

Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE 2023;2023-July():317-322

Knowledge Systems Institute Graduate School 2023

DOI: 10.18293/SEKE2023-100 · Ref ID: 5291

Knowledge graph completion plays a crucial role in downstream applications. However, existing methods tend to only rely on the structure or textual information, resulting in suboptimal model performance. Moreover, recent attempts to leverage pre-trained language models to complete knowledge graphs have proved unsatisfactory. To overcome these limitations, we propose a novel model that combines structural embedding and semantic information of the knowledge graph. Compared with previous works based on pre-trained language models, our model can better use the implicit knowledge of pre-trained language models by using relation templates, entity definitions, and learnable tokens. Furthermore, our model employs a multi-head attention mechanism to transform the embedding semantic space of entities and relations obtained from the knowledge graph embedding model, thereby enhancing their expressiveness and unifying the semantic space of both types of information. Finally, we utilize convolutional neural networks to extract features from the matrices created by combining these two types of information for link prediction and triplet classification tasks. Empirical evaluations on two knowledge graph completion datasets demonstrate that our model is effective for both tasks. © 2023 Knowledge Systems Institute Graduate School. All rights reserved.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#3785 - Sun 2024
Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models

Sun, Yichen; Chu, Zhixuan; Qin, Zhan; Ren, Kui

arXiv 2024;():

2024

Ref ID: 8415

The rapid advancement of Text-to-Image (T2I) generative models has enabled the synthesis of high-quality images guided by textual descriptions. Despite this significant progress, these models are often susceptible to generating content that contradicts the input text, which poses a challenge to their reliability and practical deployment. To address this problem, we introduce a novel diffusion-based framework to significantly enhance the alignment of generated images with their corresponding descriptions, addressing the inconsistency between visual output and textual input. Our framework is built upon a comprehensive analysis of inconsistency phenomena, categorizing them based on their manifestation in the image. Leveraging a state-of-the-art large language model, we first extract objects and construct a knowledge graph to predict the locations of these objects in potentially generated images. We then integrate a state-of-the-art controllable image generation model with a visual text generation module to generate an image that is consistent with the original prompt, guided by the predicted object locations. Through extensive experiments on an advanced multimodal hallucination benchmark, we demonstrate the efficacy of our approach in accurately generating images that are consistent with the original prompt. The code can be accessed via https://github.com/TruthAI-Lab/PCIG.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3671 - Sun 2024
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning

Sun, Yuqiang; Wu, Daoyuan; Xue, Yue; Liu, Han; Ma, Wei; Zhang, Lyuye; Liu, Yang; Li, Yingjiu

arXiv 2024;():

2024

Ref ID: 8053

Large language models (LLMs) have demonstrated significant potential in various tasks, including vulnerability detection. However, current efforts in this area are preliminary, lacking clarity on whether LLMs' vulnerability reasoning capabilities stem from the models themselves or external aids such as knowledge retrieval and tooling support. This paper aims to isolate LLMs' vulnerability reasoning from other capabilities, such as vulnerability knowledge adoption, context information retrieval, and structured output generation. We introduce LLM4Vuln, a unified evaluation framework that separates and assesses LLMs' vulnerability reasoning capabilities and examines improvements when combined with other enhancements. We conducted controlled experiments with 97 ground-truth vulnerabilities and 97 non-vulnerable cases in Solidity and Java, testing them in a total of 9,312 scenarios across four LLMs (GPT-4, GPT-3.5, Mixtral, and Llama 3). Our findings reveal the varying impacts of knowledge enhancement, context supplementation, prompt schemes, and models. Additionally, we identified 14 zero-day vulnerabilities in four pilot bug bounty programs, resulting in $3,576 in bounties.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#2842 - Sun 2022
Research and Application of Automatic Text Summarization Technology Based on Deep Learning

Sun, Z.; Meng, X.; Zheng, P.; Zhu, X.; Yang, L.

2022 11th International Conference of Information and Communication Technology (ICTech)) 2022;():225-229

2022

DOI: 10.1109/ICTech55460.2022.00052 · Ref ID: 6391

It takes a lot of time and energy for users to obtain useful information from the massive data generated by the Internet. A text summary is a refined expression of the content of an article that captures its main points. Text summarization technology allows users to quickly obtain information that is valuable to them and, to a certain extent, alleviates the problem of information overload in the era of big data. In this paper, we use a knowledge enhancement model to learn real-world semantic relationships by modeling entity concepts and other prior semantic knowledge in massive data, overcoming the limitation of previous language models that rely only on the raw language signal. A generative pre-training model is then used to address specific problems in natural language generation, such as exposure bias. The experimental results show that the model used in this paper works well on the Gigaword and CNN/DailyMail datasets. At the same time, the summaries generated on the NLPCC 2017 Chinese summarization dataset show good accuracy and readability.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3588 - Sun 2024
Knowledge Graph in Astronomical Research with Large Language Models: Quantifying Driving Forces in Interdisciplinary Scientific Discovery

Sun, Zechang; Ting, Yuan-Sen; Liang, Yaobo; Duan, Nan; Huang, Song; Cai, Zheng

arXiv 2024;():

2024

Ref ID: 8347

Identifying and predicting the factors that contribute to the success of interdisciplinary research is crucial for advancing scientific discovery. However, there is a lack of methods to quantify the integration of new ideas and technological advancements in astronomical research and how these new technologies drive further scientific breakthroughs. Large language models, with their ability to extract key concepts from vast literature beyond keyword searches, provide a new tool to quantify such processes. In this study, we extracted concepts in astronomical research from 297,807 publications between 1993 and 2024 using large language models, resulting in a set of 24,939 concepts. These concepts were then used to form a knowledge graph, where the link strength between any two concepts was determined by their relevance through the citation-reference relationships. By calculating this relevance across different time periods, we quantified the impact of numerical simulations and machine learning on astronomical research. The knowledge graph demonstrates two phases of development: a phase where the technology was integrated and another where the technology was explored in scientific discovery. The knowledge graph reveals that although machine learning has made significant inroads into astronomy, there is currently a lack of new concept development at the intersection of AI and astronomy, which may be the current bottleneck preventing machine learning from further transforming the field.

Mike voted
Ishan voted
Final decision
What was the agreed final decision?

#3622 - Suri 2023
Language Models sounds the Death Knell of Knowledge Graphs

Suri, Kunal; Singh, Atul; Mishra, Prakhar; Rout, Swapna Sourav; Sabapathy, Rajesh

arXiv 2023;():

2023

Ref ID: 7633

The healthcare domain generates a lot of unstructured and semi-structured text. Natural language processing (NLP) has been used extensively to process this data. Deep learning based NLP, especially Large Language Models (LLMs) such as BERT, has found broad acceptance and is used extensively for many applications. A language model is a probability distribution over a word sequence. Self-supervised learning on a large corpus of data automatically generates deep learning-based language models. BioBERT and Med-BERT are language models pre-trained for the healthcare domain. Healthcare uses typical NLP tasks such as question answering, information extraction, named entity recognition, and search to simplify and improve processes. However, to ensure robust application of the results, NLP practitioners need to normalize and standardize them. One of the main ways of achieving normalization and standardization is the use of Knowledge Graphs. A Knowledge Graph captures concepts and their relationships for a specific domain, but its creation is time-consuming and requires manual intervention from domain experts, which can prove expensive. SNOMED CT (Systematized Nomenclature of Medicine – Clinical Terms), Unified Medical Language System (UMLS), and Gene Ontology (GO) are popular ontologies from the healthcare domain. SNOMED CT and UMLS capture concepts such as disease, symptoms, and diagnosis, and GO is the world's largest source of information on the functions of genes. Healthcare has been dealing with an explosion in information about different types of drugs, diseases, and procedures. This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain. We present experiments using LLMs for the healthcare domain to demonstrate that language models provide the same functionality as knowledge graphs, thereby making knowledge graphs redundant.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3593 - Susanti 2024
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery

Susanti, Yuni; Färber, Michael

arXiv 2024;():

2024

Ref ID: 8487

Causal discovery aims to estimate causal structures among variables based on observational data. Large Language Models (LLMs) offer a fresh perspective to tackle the causal discovery problem by reasoning on the metadata associated with variables rather than their actual data values, an approach referred to as knowledge-based causal discovery. In this paper, we investigate the capabilities of Small Language Models (SLMs, defined as LLMs with fewer than 1 billion parameters) with prompt-based learning for knowledge-based causal discovery. Specifically, we present KG Structure as Prompt, a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning to enhance the capabilities of SLMs. Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach, surpassing most baselines and even conventional fine-tuning approaches trained on full datasets. Our findings further highlight the strong capabilities of SLMs: in combination with knowledge graphs and prompt-based learning, SLMs demonstrate the potential to surpass LLMs with a larger number of parameters. Our code and datasets are available on GitHub.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3311 - Takahashi 2024
The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

Takahashi, Ryosuke; Kamoda, Go; Heinzerling, Benjamin; Sakaguchi, Keisuke; Inui, Kentaro

arXiv 2024;():

2024

Ref ID: 8367

Language models (LMs) encode world knowledge in their internal parameters through training. However, LMs may learn personal and confidential information from the training data, leading to privacy concerns such as data leakage. Therefore, research on knowledge deletion from LMs is essential. This study focuses on the knowledge stored in LMs and analyzes the relationship between the side effects of knowledge deletion and the entities related to the knowledge. Our findings reveal that deleting knowledge related to popular entities can have catastrophic side effects. Furthermore, this research is the first to analyze knowledge deletion in models trained on synthetic knowledge graphs, indicating a new direction for controlled experiments.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3519 - Talukdar 2024
Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity

Talukdar, Wrick; Biswas, Anjanava

arXiv 2024;():

2024

Ref ID: 8517

As Large Language Models (LLMs) become increasingly sophisticated and ubiquitous in natural language processing (NLP) applications, ensuring their robustness, trustworthiness, and alignment with human values has become a critical challenge. This paper presents a novel framework for contextual grounding in textual models, with a particular emphasis on the Context Representation stage. Our approach aims to enhance the reliability and ethical alignment of these models through a comprehensive, context-aware methodology. By explicitly capturing and representing relevant situational, cultural, and ethical contexts in a machine-readable format, we lay the foundation for anchoring a model's behavior within these contexts. Our approach leverages techniques from knowledge representation and reasoning, such as ontologies, semantic web technologies, and logic-based formalisms. We evaluate our framework on real-world textual datasets, demonstrating its effectiveness in improving model performance, fairness, and alignment with human expectations, while maintaining high accuracy. Furthermore, we discuss the other key components of the framework, including context-aware encoding, context-aware learning, interpretability and explainability, and continuous monitoring and adaptation. This research contributes to the growing body of work on responsible AI, offering a practical approach to developing more reliable, trustworthy, and ethically-aligned language models. Our findings have significant implications for the deployment of LLMs in sensitive domains such as healthcare, legal systems, and social services, where contextual understanding is paramount.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#811 - Tan 2022
TEGTOK: Augmenting Text Generation via Task-specific and Open-world Knowledge

Tan, C. H.; Gu, J. C.; Tao, C. Y.; Ling, Z. H.; Xu, C.; Hu, H.; Geng, X. B.; Jiang, D. X.; Assoc Computa, Linguist

60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():1597-1609

Dublin, IRELAND Assoc Computational Linguistics-Acl 2022

Ref ID: 3395

Generating natural and informative texts has been a long-standing problem in NLP. Much effort has been dedicated to incorporating pre-trained language models (PLMs) with various kinds of open-world knowledge, such as knowledge graphs or wiki pages. However, their ability to access and manipulate task-specific knowledge is still limited on downstream tasks, as this type of knowledge is usually not well covered in PLMs and is hard to acquire. To address the problem, we propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TEGTOK) in a unified framework. Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively on the basis of PLMs. With the help of these two types of knowledge, our model can learn what and how to generate. Experiments on two text generation tasks, dialogue generation and question generation, across two datasets show that our method achieves better performance than various baseline models.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1887 - Tan 2024
Small Models, Big Insights: Leveraging Slim Proxy Models to Decide When and What to Retrieve for LLMs

Tan, J.; Dou, Z.; Zhu, Y.; Guo, P.; Fang, K.; Wen, J. R.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4420-4436

Association for Computational Linguistics (ACL) 2024

Ref ID: 4348

The integration of large language models (LLMs) and search engines represents a significant evolution in knowledge acquisition methodologies. However, determining the knowledge that an LLM already possesses and the knowledge that requires the help of a search engine remains an unresolved issue. Most existing methods solve this problem through the results of preliminary answers or reasoning done by the LLM itself, but this incurs excessively high computational costs. This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in LLMs with a slim proxy model, to enhance the LLM's knowledge acquisition process. We employ a proxy model which has far fewer parameters, and take its answers as heuristic answers. Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM. We only conduct retrieval for the missing knowledge in questions that the LLM does not know. Extensive experimental results on five datasets with two LLMs demonstrate a notable improvement in the end-to-end performance of LLMs in question-answering tasks, achieving or surpassing current state-of-the-art models with lower LLM inference costs. © 2024 Association for Computational Linguistics.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#363 - Tan 2023
Incorporating entity-level knowledge in pretrained language model for biomedical dense retrieval

Tan, J. J.; Hu, J. L.; Dong, S. B.

Comput. Biol. Med. 2023;166():10

2023

DOI: 10.1016/j.compbiomed.2023.107535 · Ref ID: 3427

In recent years, pre-trained language models (PLMs) have dominated natural language processing (NLP) and achieved outstanding performance in various NLP tasks, including dense retrieval based on PLMs. However, in the biomedical domain, the effectiveness of dense retrieval models based on PLMs still needs to be improved due to the diversity and ambiguity of entity expressions caused by the richness of biomedical entities. To alleviate the semantic gap, in this paper we propose a method that incorporates external knowledge at the entity level into a dense retrieval model to enrich the dense representations of queries and documents. Specifically, we first add additional self-attention and information interaction modules in the Transformer layer of the BERT architecture to perform fusion and interaction between query/document text and entity embeddings from knowledge graphs. We then propose an entity similarity loss to constrain the model to better learn external knowledge from entity embeddings, and further propose a weighted entity concatenation mechanism to balance the impact of entity representations when matching queries and documents. Experiments on two publicly available biomedical retrieval datasets show that our proposed method outperforms state-of-the-art dense retrieval methods. In terms of NDCG metrics, the proposed method (called ELK) improves the ranking performance of coCondenser by at least 5% on both datasets, and also obtains further performance gains over the state-of-the-art EVA method. Despite its more sophisticated architecture, the average query latency of ELK is still within the same order of magnitude as that of other efficient methods.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3770 - Tan 2019
Positional Attention-based Frame Identification with BERT: A Deep Learning Approach to Target Disambiguation and Semantic Frame Selection

Tan, Sang-Sang; Na, Jin-Cheon

arXiv 2019;():

2019

Ref ID: 7380

Semantic parsing is the task of transforming sentences from natural language into formal representations of predicate-argument structures. Under this research area, frame-semantic parsing has attracted much interest. This parsing approach leverages the lexical information defined in FrameNet to associate marked predicates or targets with semantic frames, thereby assigning semantic roles to sentence components based on pre-specified frame elements in FrameNet. In this paper, a deep neural network architecture known as Positional Attention-based Frame Identification with BERT (PAFIBERT) is presented as a solution to the frame identification subtask in frame-semantic parsing. Although the importance of this subtask is well-established, prior research has yet to find a robust solution that works satisfactorily for both in-domain and out-of-domain data. This study thus set out to improve frame identification in light of recent advancements of language modeling and transfer learning in natural language processing. The proposed method is partially empowered by BERT, a pre-trained language model that excels at capturing contextual information in texts. By combining the language representation power of BERT with a position-based attention mechanism, PAFIBERT is able to attend to target-specific contexts in sentences for disambiguating targets and associating them with the most suitable semantic frames. Under various experimental settings, PAFIBERT outperformed existing solutions by a significant margin, achieving new state-of-the-art results for both in-domain and out-of-domain benchmark test sets.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#1540 - Tanaka 2024
KnowledgeHub: An End-to-End Tool for Assisted Scientific Discovery

Tanaka, S.; Barry, J.; Kuruvanthodi, V.; Moses, M.; Giammona, M. J.; Herr, N.; Elkaref, M.; De Mel, G.

IJCAI International Joint Conference on Artificial Intelligence 2024;():8815-8819

International Joint Conferences on Artificial Intelligence 2024

Ref ID: 4378

This paper describes the KnowledgeHub tool, a scientific literature Information Extraction (IE) and Question Answering (QA) pipeline. This is achieved by supporting the ingestion of PDF documents that are converted to text and structured representations. An ontology can then be constructed where a user defines the types of entities and relationships they want to capture. A browser-based annotation tool enables annotating the contents of the PDF documents according to the ontology. Named Entity Recognition (NER) and Relation Classification (RC) models can be trained on the resulting annotations and can be used to annotate the unannotated portion of the documents. A knowledge graph is constructed from these entity and relation triples which can be queried to obtain insights from the data. Furthermore, we integrate a suite of Large Language Models (LLMs) that can be used for QA and summarisation that is grounded in the included documents via a retrieval component. KnowledgeHub is a unique tool that supports annotation, IE and QA, which gives the user full insight into the knowledge discovery pipeline. © 2024 International Joint Conferences on Artificial Intelligence. All rights reserved.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#756 - Tang 2024
Semantic-aware entity alignment for low resource language knowledge graph

Tang, J. F.; Song, R.; Huang, Y. X.; Gao, S. X.; Yu, Z. T.

Front. Comput. Sci. 2024;18(4):10

2024

DOI: 10.1007/s11704-023-2542-x · Ref ID: 3201

Entity alignment (EA) is an important technique aiming to find the same real-world entity between two different source knowledge graphs (KGs). Current methods typically learn entity embeddings for EA from the structure of KGs. Most EA models are designed for rich-resource languages, requiring sufficient resources such as a parallel corpus and pre-trained language models. However, low-resource language KGs have received less attention, and current models demonstrate poor performance on those low-resource KGs. Recently, researchers have fused relation information and attributes into entity representations to enhance entity alignment performance, but the relation semantics are often ignored. To address these issues, we propose a novel Semantic-aware Graph Neural Network (SGNN) for entity alignment. First, we generate pseudo sentences according to the relation triples and produce representations using pre-trained models. Second, our approach explores semantic information from the connected relations via a graph neural network. Our model captures expanded feature information from KGs. Experimental results using three low-resource languages demonstrate that our proposed SGNN approach outperforms state-of-the-art alignment methods on three proposed datasets and three public datasets.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3483 - Tang 2024
GraphArena: Benchmarking Large Language Models on Graph Computational Problems

Tang, Jianheng; Zhang, Qifan; Li, Yuhan; Li, Jia

arXiv 2024;():

2024

Ref ID: 8434

The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully examine their progress. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of 10 computational tasks, encompassing four polynomial-time (e.g., Shortest Distance) and six NP-complete challenges (e.g., Travelling Salesman Problem). It features a rigorous evaluation framework that classifies LLM outputs as correct, suboptimal (feasible but not optimal), or hallucinatory (properly formatted but infeasible). Evaluation of 10 leading LLMs, including GPT-4o and LLaMA3-70B-Instruct, reveals that even top-performing models struggle with larger, more complex graph problems and exhibit hallucination issues. Despite the application of strategies such as chain-of-thought prompting, these issues remain unresolved. GraphArena contributes a valuable supplement to the existing LLM benchmarks and is open-sourced at https://github.com/squareRoot3/GraphArena.

Kwesi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3263 - Tao 2024
Clue-Guided Path Exploration: Optimizing Knowledge Graph Retrieval with Large Language Models to Address the Information Black Box Challenge

Tao, Dehao; Huang, Feng; Wang, Congqi; Huang, Yongfeng; Jiang, Minghu

arXiv 2024;():

2024

Ref ID: 8045

In recent times, large language models (LLMs) have showcased remarkable capabilities. However, updating their knowledge poses challenges, potentially leading to inaccuracies when confronted with unfamiliar queries. To address this issue, integrating external knowledge bases such as knowledge graphs with large language models is a viable approach. The key challenge lies in extracting the required knowledge from knowledge graphs based on natural language, demanding high semantic understanding. Therefore, researchers are considering leveraging large language models directly for knowledge retrieval from these graphs. Current efforts typically rely on the comprehensive problem-solving capabilities of large language models. We argue that a problem we term the 'information black box' can significantly impact the practical effectiveness of such methods. Moreover, such methods are less effective in scenarios where the questions are unfamiliar to the large language models. In this paper, we propose a Clue-Guided Path Exploration (CGPE) framework to optimize knowledge retrieval based on large language models. By addressing the 'information black box' issue and employing single-task approaches instead of complex ones, we have enhanced the accuracy and efficiency of using large language models to retrieve knowledge from knowledge graphs. Experiments on open-source datasets reveal that CGPE outperforms previous methods and is highly applicable to LLMs with fewer parameters. In some instances, even ChatGLM3, with its 6 billion parameters, can rival the performance of GPT-4. Furthermore, the results indicate a minimal invocation frequency of CGPE on LLMs, suggesting reduced computational overhead. For organizations and individuals facing constraints in computational resources, our research offers significant practical value.

Kwesi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#318 - Taunk 2023
GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering

Taunk, D.; Khanna, L.; Kandru, P.; Varma, V.; Sharma, C.; Tapaswi, M.; Acm

32nd World Wide Web Conference (WWW) 2023;():1138-1144

Austin, TX Assoc Computing Machinery 2023

DOI: 10.1145/3543873.3587651 · Ref ID: 3409

Commonsense question-answering (QA) methods combine the power of pre-trained Language Models (LM) with the reasoning provided by Knowledge Graphs (KG). A typical approach collects nodes relevant to the QA pair from a KG to form a Working Graph (WG) followed by reasoning using Graph Neural Networks (GNNs). This faces two major challenges: (i) it is difficult to capture all the information from the QA in the WG, and (ii) the WG contains some irrelevant nodes from the KG. To address these, we propose GrapeQA with two simple improvements on the WG: (i) Prominent Entities for Graph Augmentation identifies relevant text chunks from the QA pair and augments the WG with corresponding latent representations from the LM, and (ii) Context-Aware Node Pruning removes nodes that are less relevant to the QA pair. We evaluate our results on OpenBookQA, CommonsenseQA and MedQA-USMLE and see that GrapeQA shows consistent improvements over its LM + KG predecessor (QA-GNN in particular) and large improvements on OpenBookQA.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3598 - Teneva 2023
Knowledge Graphs are not Created Equal: Exploring the Properties and Structure of Real KGs

Teneva, Nedelina; Hruschka, Estevam

arXiv 2023;():

2023

Ref ID: 7932

Despite the recent popularity of knowledge graph (KG) related tasks and benchmarks such as KG embeddings, link prediction, entity alignment and evaluation of the reasoning abilities of pretrained language models as KGs, the structure and properties of real KGs are not well studied. In this paper, we perform a large scale comparative study of 29 real KG datasets from diverse domains such as the natural sciences, medicine, and NLP to analyze their properties and structural patterns. Based on our findings, we make several recommendations regarding KG-based model development and evaluation. We believe that the rich structural information contained in KGs can benefit the development of better KG models across fields and we hope this study will contribute to breaking the existing data silos between different areas of research (e.g., ML, NLP, AI for sciences).

Kwesi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#245 - Terron 2023
Event Extraction and Semantic Representation from Spanish Workers' Statute Using Large Language Models

Terron, G. A.; Chozas, P. M.; Doncel, V. R.

36th Annual International Conference on Legal Knowledge and Information Systems (JURIX) 2023;379():329-334

Maastricht Univ, Maastricht, NETHERLANDS Ios Press 2023

DOI: 10.3233/faia230983 · Ref ID: 3169

This work uses Large Language Models to process an important piece of Spanish legislation: the Workers' Statute. The proposed method extracts the relevant events in its articles using a GPT-3.5 model and represents the entities involved in the events and the relationships between them as RDF triples. The experiments carried out to select a high-performance strategy include both zero- and few-shot learning tests. Finally, this work proposes a strategy to uplift the extracted legal relations into a legal knowledge graph.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#775 - Thai 2021
Simultaneously Self-Attending to Text and Entities for Knowledge-Informed Text Representations

Thai, D.; Thirukovalluru, R.; Bansal, T.; McCallum, A.; Assoc Computat, Linguist

Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():241-247

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 3407

Pre-trained language models have emerged as highly successful methods for learning good text representations. However, the amount of structured knowledge retained in such models, and how (if at all) it can be extracted, remains an open question. In this work, we aim at directly learning text representations which leverage structured knowledge about entities mentioned in the text. This can be particularly beneficial for downstream tasks which are knowledge-intensive. Our approach utilizes self-attention between words in the text and knowledge graph (KG) entities mentioned in the text. While existing methods require entity-linked data for pre-training, we train using a mention-span masking objective and a candidate ranking objective, which do not require any entity links and only assume access to an alias table for retrieving candidates, enabling large-scale pre-training. We show that the proposed model learns knowledge-informed text representations that yield improvements on downstream tasks over existing methods.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#2570 - Thakur 2012
Information extraction from semi-structured and un-structured documents using probabilistic context free grammar inference

Thakur, R.; Jain, S.; Chaudhari, N. S.; Singhai, R.

2012 International Conference on Information Retrieval & Knowledge Management 2012;():273-276

2012

DOI: 10.1109/InfRKM.2012.6204988 · Ref ID: 6240

A large number of research papers are available in un-structured (text) format. Knowledge discovery in un-structured documents has been recognized as a promising task. These documents are typically formatted for human viewing, which varies widely from document to document. Frequent changes in their formatting cause difficulties in constructing a global schema. Thus, discovery of interesting rules from them is a complex and tedious process. Recently, conditional random fields (CRFs) and hand-coded wrappers have been used to label the text (such as Title, Author Name(s), Affiliation, Email, Contact number, etc. in research papers). In this paper we propose a novel hybrid approach to infer grammar rules using alignment similarity and probabilistic context-free grammar. It helps in extracting desired information from the document.

Mike voted
brandon voted
Final decision
What was the agreed final decision?

#1020 - Thant 2023
BERT Fine-Tuning the Covid-19 Open Research Dataset for Named Entity Recognition

Thant, S.; Racharak, T.; Andres, F.

Communications in Computer and Information Science 2023;1942 CCIS():261-275

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-981-99-7969-1_19 · Ref ID: 5060

This study employs the widely used Large Language Model (LLM), BERT, to implement Named Entity Recognition (NER) on the CORD-19 biomedical literature corpus. By fine-tuning the pre-trained BERT on the CORD-NER dataset, the model gains the ability to comprehend the context and semantics of biomedical named entities. The refined model is then utilized on the CORD-19 to extract more contextually relevant and updated named entities. However, fine-tuning large datasets with LLMs poses a challenge. To counter this, two distinct sampling methodologies are proposed to apply on each dataset. First, for the NER task on the CORD-19, a Latent Dirichlet Allocation (LDA) topic modeling technique is employed. This maintains the sentence structure while concentrating on related content. Second, a straightforward greedy method is deployed to gather the most informative data of 25 entity types from the CORD-NER dataset. The study realizes its goals by demonstrating the content comprehension capability of BERT-based models without the necessity of supercomputers, and converting the document-level corpus into a source for NER data, enhancing data accessibility. The outcomes of this research can shed light on the potential progression of more sophisticated NLP applications across various sectors, including knowledge graph creation, ontology learning, and conversational AI. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2023.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1748 - Thießen 2023
Probing Large Language Models for Scientific Synonyms

Thießen, F.; D’Souza, J.; Stocker, M.

CEUR Workshop Proceedings 2023;3510():

CEUR-WS 2023

Ref ID: 5198

Purpose: Automatically identifying synonyms is an important but challenging aspect of entity normalization in knowledge graphs. Entity normalization is crucial in ensuring that information in knowledge graphs is well connected and therefore efficiently reusable. We aim to investigate the potential of pre-trained large language models (LLMs) for this task. Methodology: We use k-Means clustering to compare latent concepts learned by LLMs with human-defined scientific synonymy concept clusters sourced from ORKG, CS-KG, SemEval 2017, and SciERC data. We investigate the models BERT, RoBERTa, BART, and OpenAI GPT3 (text-embedding-ada-002 variant) and evaluate clustering results by model layer. Findings: F1 scores average around 0.7 to 0.75 depending on the dataset and layer. The best results are reached using OpenAI GPT3 (max F1=0.914). We further notice no advantage of models trained on scientific data. Value: Our results suggest information learned by transformer models aligns with human-defined scientific synonyms. This shows the potential of information encoded in pre-trained LLMs to be leveraged for synonymy detection. © 2023 Copyright for this paper by its authors.
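The comparison above, between k-Means clusters of LLM embeddings and human-defined synonym clusters, requires a score over clusterings. The abstract does not state which F1 variant is used; the sketch below assumes pairwise F1 over item pairs, one common choice, and the function name is illustrative:

```python
from itertools import combinations

def pairwise_f1(gold, pred):
    """Pairwise F1 between a gold clustering and a predicted one.

    gold, pred: lists where position i holds item i's cluster label."""
    same_gold = {pair for pair in combinations(range(len(gold)), 2)
                 if gold[pair[0]] == gold[pair[1]]}
    same_pred = {pair for pair in combinations(range(len(pred)), 2)
                 if pred[pair[0]] == pred[pair[1]]}
    if not same_gold or not same_pred:
        return 0.0
    tp = len(same_gold & same_pred)   # pairs both clusterings place together
    if tp == 0:
        return 0.0
    precision = tp / len(same_pred)
    recall = tp / len(same_gold)
    return 2 * precision * recall / (precision + recall)
```

Because the score is defined over pairs rather than cluster labels, it is invariant to label permutation, which is what makes it usable for comparing unsupervised k-Means output against gold concept groups.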

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#1468 - Tian 2024
KG-Adapter: Enabling Knowledge Graph Integration in Large Language Models through Parameter-Efficient Fine-Tuning

Tian, S.; Luo, Y.; Xu, T.; Yuan, C.; Jiang, H.; Chen, W.; Wang, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():3813-3828

Association for Computational Linguistics (ACL) 2024

Ref ID: 4259

Although large language models (LLMs) show remarkable capabilities and generalizability across various tasks, they are criticized for their lack of expertise. One promising solution is to combine knowledge graphs (KGs) with LLMs, and recent studies focus on integrating KGs into LLMs through prompt-based methods. However, these approaches fail to use the structural information of the KGs, suffer from the problem of knowledge conflict, and over-rely on super LLMs. To address these challenges, we propose KG-Adapter, a parameter-level KG integration method based on parameter-efficient fine-tuning (PEFT). Specifically, we introduce a novel adapter structure designed for decoder-only LLMs, which can encode KGs from both node-centered and relation-centered perspectives, and then perform joint reasoning with LLMs to generate responses end-to-end. Experiments with diverse models on four datasets for two different tasks all demonstrate significant improvements. With only 28M parameters trained, we make the 7B-parameter LLM outperform the previous full-parameter fine-tuned state-of-the-art method and perform comparably to the prompt-based ChatGPT methods. © 2024 Association for Computational Linguistics.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#648 - Tian 2024
PDEC: A Framework for Improving Knowledge Graph Reasoning Performance through Predicate Decomposition

Tian, X.; Meng, Y.

Algorithms 2024;17(3):17

2024

DOI: 10.3390/a17030129 · Ref ID: 2982

The judicious configuration of predicates is a crucial but often overlooked aspect in the field of knowledge graphs. While previous research has primarily focused on the precision of triples in assessing knowledge graph quality, the rationality of predicates has been largely ignored. This paper introduces an innovative approach aimed at enhancing knowledge graph reasoning by addressing the issue of predicate polysemy. Predicate polysemy refers to instances where a predicate possesses multiple meanings, introducing ambiguity into the knowledge graph. We present an adaptable optimization framework that effectively addresses predicate polysemy, thereby enhancing reasoning capabilities within knowledge graphs. Our approach serves as a versatile and generalized framework applicable to any reasoning model, offering a scalable and flexible solution to enhance performance across various domains and applications. Through rigorous experimental evaluations, we demonstrate the effectiveness and adaptability of our methodology, showing significant improvements in knowledge graph reasoning accuracy. Our findings underscore that discerning predicate polysemy is a crucial step towards achieving a more dependable and efficient knowledge graph reasoning process. Even in the age of large language models, the optimization and induction of predicates remain relevant in ensuring interpretable reasoning.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2598 - Tiwari 2003
Invisible formal methods for embedded control systems

Tiwari, A.; Shankar, N.; Rushby, J.

Proceedings of the IEEE 2003;91(1):29-39

2003

DOI: 10.1109/JPROC.2002.805818 · Ref ID: 6596

Embedded control systems typically comprise continuous control laws combined with discrete mode logic. These systems are modeled using a hybrid automaton formalism, which is obtained by combining the discrete transition system formalism with continuous dynamical systems. This paper develops automated analysis techniques for asserting correctness of hybrid system designs. Our approach is based on symbolic representation of the state space of the system using mathematical formulas in an appropriate logic. Such formulas are manipulated using symbolic theorem proving techniques. It is important that formal analysis should be unobtrusive and acceptable to engineering practice. We motivate a methodology called invisible formal methods that provides a graded sequence of formal analysis technologies ranging from extended typechecking, through approximation and abstraction, to model checking and theorem proving. As an instance of invisible formal methods, we describe techniques to check inductive invariants, or extended types, for hybrid systems and compute discrete finite state abstractions automatically to perform reachability set computation. The abstract system is sound with respect to the formal semantics of hybrid automata. We also discuss techniques for performing analysis on nonstandard semantics of hybrid automata. We also briefly discuss the problem of translating models in Simulink/Stateflow language, which is widely used in practice, into the modeling formalisms, like hybrid automata, for which analysis tools are being developed.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2890 - Todoran 2015
Semantic investigation of a control-flow subset of BPMN 2.0

Todoran, E. N.; Mitrea, P.

2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP) 2015;():483-490

2015

DOI: 10.1109/ICCP.2015.7312707 · Ref ID: 6876

Business Process Model and Notation (BPMN), now at version 2.0.2, provides a standard graphical representation for specifying business processes. In this paper we report on the first stage of a semantic investigation of BPMN, using methods in the tradition of programming languages semantics. We consider a control-flow subset of BPMN and an execution architecture based on an intermediate language that we name ℒBPMN. The execution architecture comprises two main components: a translator which takes as input a BPMN model and generates ℒBPMN code, and an interpreter for ℒBPMN. ℒBPMN is a process oriented imperative language providing a combination of concepts, including maximal parallelism and durational activities. We employ the mathematical methodology of metric semantics in designing and relating an operational semantics O and a denotational semantics D for ℒBPMN. We establish the formal relation between O and D by using an abstraction operator and a fixed point argument. In this way we prove the correctness of the denotational semantics with respect to the operational semantics. We focus on the semantic investigation of BPMN. We also explain how the operational semantics can serve as a blueprint for an implementation on a client-server architecture.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#370 - Toledo 2019
Information extraction from historical handwritten document images with a context-aware neural model

Toledo, J. I.; Carbonell, M.; Fornés, A.; Lladós, J.

Pattern Recognit. 2019;86():27-36

2019

DOI: 10.1016/j.patcog.2018.08.020 · Ref ID: 3731

Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used or accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirectional Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches. (C) 2018 Elsevier Ltd. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#53 - Tong 2024
Automating psychological hypothesis generation with AI: when large language models meet causal graph

Tong, S.; Mao, K.; Huang, Z.; Zhao, Y. K.; Peng, K. P.

Hum. Soc. Sci. Commun. 2024;11(1):14

2024

DOI: 10.1057/s41599-024-03407-5 · Ref ID: 3199

Leveraging the synergy between causal knowledge graphs and a large language model (LLM), our study introduces a groundbreaking approach for computational hypothesis generation in psychology. We analyzed 43,312 psychology articles using a LLM to extract causal relation pairs. This analysis produced a specialized causal graph for psychology. Applying link prediction algorithms, we generated 130 potential psychological hypotheses focusing on "well-being", then compared them against research ideas conceived by doctoral scholars and those produced solely by the LLM. Interestingly, our combined approach of a LLM and causal graphs mirrored the expert-level insights in terms of novelty, clearly surpassing the LLM-only hypotheses (t(59) = 3.34, p = 0.007 and t(59) = 4.32, p < 0.001, respectively). This alignment was further corroborated using deep semantic analysis. Our results show that combining LLM with machine learning techniques such as causal knowledge graphs can revolutionize automated discovery in psychology, extracting novel insights from the extensive literature. This work stands at the crossroads of psychology and artificial intelligence, championing a new enriched paradigm for data-driven hypothesis generation in psychological research.
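The link-prediction step over the causal graph can be illustrated with a common-neighbors baseline. This is a minimal sketch in plain Python; the study does not name its specific algorithms in the abstract, so both the technique choice and the function name here are assumptions:

```python
from collections import defaultdict

def common_neighbor_scores(edges, top_k=5):
    """Rank unlinked node pairs by shared-neighbor count: a minimal
    link-prediction baseline for surfacing candidate relations."""
    nbrs = defaultdict(set)
    for u, v in edges:          # build an undirected adjacency map
        nbrs[u].add(v)
        nbrs[v].add(u)
    nodes = sorted(nbrs)
    scored = []
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v not in nbrs[u]:                 # consider only unlinked pairs
                scored.append(((u, v), len(nbrs[u] & nbrs[v])))
    scored.sort(key=lambda item: -item[1])
    return scored[:top_k]
```

Applied to a causal graph of extracted relation pairs, high-scoring unlinked pairs are exactly the "missing edges" that become candidate hypotheses for human review.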

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3969 - Tong 2024
Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study

Tong, Xu; Smirnova, Nina; Upadhyaya, Sharmila; Yu, Ran; Culbert, Jack H.; Sun, Chao; Otto, Wolfgang; Mayr, Philipp

arXiv 2024;():

2024

Ref ID: 8559

Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. We then performed NER tasks for the 6 entity types using ChatGPT (GPT-3.5 and GPT-4) and 4 state-of-the-art BERT-based question-answering (QA) models (RoBERTa, MiniLM, PubMedBERT and SciBERT) without prior training on the specific task. A domain fine-tuned model (GSAP-NER) was also applied for a comprehensive comparison. Results: The overall performance of LLMs varied significantly between exact match and fuzzy match. In fuzzy match, ChatGPT surpassed BERT-based QA models in 5 out of 6 tasks, while in exact match, BERT-based QA models outperformed ChatGPT in 5 out of 6 tasks but with a smaller F-1 difference. GPT-4 showed a significant advantage over other models in fuzzy match, especially on the entity types of TCM formula and the Chinese patent drug (TFD) and ingredient (IG). Although GPT-4 outperformed BERT-based models on the entity types of herb, target, and research method, none of the F-1 scores exceeded 0.5. GSAP-NER outperformed GPT-4 in terms of F-1 by a slight margin on RM. ChatGPT achieved considerably higher recalls than precisions, particularly in fuzzy match. Conclusions: The NER performance of LLMs is highly dependent on the entity type, and their performance varies across application scenarios. ChatGPT could be a good choice for scenarios where high recall is favored. However, for knowledge acquisition in rigorous scenarios, neither ChatGPT nor BERT-based QA models are off-the-shelf tools for professional practitioners.
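The exact-versus-fuzzy distinction that drives the results above can be made concrete. This is a minimal sketch assuming fuzzy match means any character overlap with a matching entity type; the paper's precise criterion is not given in the abstract, and the function name is illustrative:

```python
def ner_f1(gold, pred, fuzzy=False):
    """Micro precision/recall/F1 for NER under exact or fuzzy matching.

    gold, pred: lists of (start, end, entity_type) spans for one document."""
    def hit(p, g):
        if p[2] != g[2]:                  # entity types must agree
            return False
        if fuzzy:                         # any character overlap counts
            return p[0] < g[1] and g[0] < p[1]
        return p[:2] == g[:2]             # exact span boundaries required

    tp_pred = sum(any(hit(p, g) for g in gold) for p in pred)
    tp_gold = sum(any(hit(p, g) for p in pred) for g in gold)
    precision = tp_pred / len(pred) if pred else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    denom = precision + recall
    return precision, recall, (2 * precision * recall / denom) if denom else 0.0
```

A prediction that is off by one character scores zero under exact match but full credit under fuzzy match, which is why the two settings can rank the same models differently.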

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2060 - Toro 2024
Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence (DRAGON-AI)

Toro, S.; Anagnostopoulos, A. V.; Bello, S. M.; Blumberg, K.; Cameron, R.; Carmody, L.; Diehl, A. D.; Dooley, D. M.; Duncan, W. D.; Fey, P.; Gaudet, P.; Harris, N. L.; Joachimiak, M. P.; Kiani, L.; Lubiana, T.; Munoz-Torres, M. C.; O'Neil, S.; Osumi-Sutherland, D.; Puig-Barbe, A.; Reese, J. T.; Reiser, L.; Robb, S. M.; Ruemping, T.; Seager, J.; Sid, E.; Stefancsik, R.; Weber, M.; Wood, V.; Haendel, M. A.; Mungall, C. J.

J Biomed Semantics 2024;15(1):19

2024

DOI: 10.1186/s13326-024-00320-3 · Ref ID: 5946

BACKGROUND: Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources and necessitate substantial collaboration between domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). DRAGON-AI can generate textual and logical ontology components, drawing from existing knowledge in multiple ontologies and unstructured text sources. RESULTS: We assessed performance of DRAGON-AI on de novo term construction across ten diverse ontologies, making use of extensive manual evaluation of results. Our method has high precision for relationship generation, but slightly lower precision than logic-based reasoning. Our method is also able to generate definitions deemed acceptable by expert evaluators, but these scored worse than human-authored definitions. Notably, evaluators with the highest level of confidence in a domain were better able to discern flaws in AI-generated definitions. We also demonstrated the ability of DRAGON-AI to incorporate natural language instructions in the form of GitHub issues. CONCLUSIONS: These findings suggest DRAGON-AI's potential to substantially aid the manual ontology construction process. However, our results also underscore the importance of having expert curators and ontology editors drive the ontology generation process.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#3381 - Trajanoska 2023
Enhancing Knowledge Graph Construction Using Large Language Models

Trajanoska, Milena; Stojanov, Riste; Trajanov, Dimitar

arXiv 2023;():

2023

Ref ID: 7694

The growing trend of Large Language Model (LLM) development has attracted significant attention, with new models for various applications emerging constantly. However, the combined application of Large Language Models with semantic technologies for reasoning and inference is still a challenging task. This paper analyzes how the current advances in foundational LLMs, like ChatGPT, compare with specialized pre-trained models, like REBEL, for joint entity and relation extraction. To evaluate this approach, we conducted several experiments using sustainability-related text as our use case. We created pipelines for the automatic creation of Knowledge Graphs from raw texts, and our findings indicate that using advanced LLM models can improve the accuracy of the process of creating these graphs from unstructured text. Furthermore, we explored the potential of automatic ontology creation using foundation LLMs, which resulted in even more relevant and accurate knowledge graphs.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3160 - Tran 2021
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs

Tran, Hieu; Phan, Long; Anibal, James; Nguyen, Binh T.; Nguyen, Truong-Son

Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8–12, 2021, Proceedings, Part I 2021;():512–523

Sanur, Bali, Indonesia Springer-Verlag 2021

DOI: 10.1007/978-3-030-92185-9_42 · Ref ID: 7319

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1257 - Tran 2024
Enhancing Knowledge Retrieval with Topic Modeling for Knowledge-Grounded Dialogue

Tran, N.; Litman, D.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():5986-5995

European Language Resources Association (ELRA) 2024

Ref ID: 4586

Knowledge retrieval is one of the major challenges in building a knowledge-grounded dialogue system. A common method is to use a neural retriever with a distributed approximate nearest-neighbor database to quickly find the relevant knowledge sentences. In this work, we propose an approach that utilizes topic modeling on the knowledge base to further improve retrieval accuracy and as a result, improve response generation. Additionally, we experiment with a large language model, ChatGPT, to take advantage of the improved retrieval performance to further improve the generation results. Experimental results on two datasets show that our approach can increase retrieval and generation performance. The results also indicate that ChatGPT is a better response generator for knowledge-grounded dialogue when relevant knowledge is provided. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#866 - Trappey 2022
Using Machine Learning Language Models to Generate Innovation Knowledge Graphs for Patent Mining

Trappey, A. J. C.; Liang, C. P.; Lin, H. J.

Appl. Sci.-Basel 2022;12(19):19

2022

DOI: 10.3390/app12199818 · Ref ID: 2933

To explore and understand the state-of-the-art innovations in any given domain, researchers often need to study many domain patents and synthesize their knowledge content. This study provides a smart patent knowledge graph generation system, adopting a machine learning (ML) natural language modeling approach, to help researchers grasp the patent knowledge by generating deep knowledge graphs. This research focuses on converting chemical utility patents, consisting of chemistries and chemical processes, into summarized knowledge graphs. The research methods are in two parts, i.e., the visualization of the chemical processes in the chemical patents' most relevant paragraphs and a knowledge graph of any domain-specific collection of patent texts. The ML language modeling algorithms, including ALBERT for text vectorization, Sentence-BERT for sentence classification, and KeyBERT for keyword extraction, are adopted. These models are trained and tested in the case study using 879 chemical patents in the carbon capture domain. The results demonstrate that the average retention rate of the summary graphs for five clustered patent texts exceeds 80%. The proposed approach is novel and proven to be reliable in graphical deep knowledge representation.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#3265 - Trofimova 2024
CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers

Trofimova, Ekaterina; Sataev, Emil; Jowhari, Abhijit Singh

arXiv 2024;():

2024

Ref ID: 8557

This paper presents CodeRefine, a novel framework for automatically transforming research paper methodologies into functional code using Large Language Models (LLMs). Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph using a predefined ontology. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach. CodeRefine addresses the challenge of bridging theoretical research and practical implementation, offering a more accurate alternative to LLM zero-shot prompting. Evaluations on diverse scientific papers demonstrate CodeRefine's ability to improve code implementation from the paper, potentially accelerating the adoption of cutting-edge algorithms in real-world applications.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#1266 - Tsaneva 2024
Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop

Tsaneva, S.; Dessì, D.; Osborne, F.; Sabou, M.

CEUR Workshop Proceedings 2024;3780():

CEUR-WS 2024

Ref ID: 4121

Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated KG generation methodologies, leveraging both a human-in-the-loop (HiL) and a large language model (LLM)-in-the-loop. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12% (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4% increase in the F1 score (from 77% to 81%). © 2022 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1612 - Tsaneva 2024
LLM-driven Ontology Evaluation: Verifying Ontology Restrictions with ChatGPT

Tsaneva, S.; Vasic, S.; Sabou, M.

CEUR Workshop Proceedings 2024;3747():15

CEUR-WS 2024

Ref ID: 4382

Recent advancements in artificial intelligence, particularly in large language models (LLMs), have sparked interest in their application to knowledge engineering (KE) tasks. While existing research has primarily explored the utilisation of LLMs for constructing and completing semantic resources such as ontologies and knowledge graphs, the evaluation of these resources, which addresses quality issues, has not yet been thoroughly investigated. To address this gap, we propose an LLM-driven approach for the verification of ontology restrictions. We replicate our previously conducted human-in-the-loop experiment using ChatGPT-4 instead of human contributors to assess whether comparable ontology verification results can be obtained. We find that (1) ChatGPT-4 achieves intermediate-to-expert scores on an ontology modelling qualification test; (2) the model performs ontology restriction verification with an accuracy of 92.22%; (3) combining model answers on the same ontology axiom represented in different formalisms improves the accuracy to 96.67%; and (4) higher accuracy is observed in identifying defects related to the incompleteness of ontology axioms compared to errors due to restrictions misuse. Our results highlight the potential of LLMs in supporting knowledge engineering tasks and outline future research directions in the area. © 2024 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3331 - Tu 2024
DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning

Tu, Shangqing; Zhu, Kejian; Bai, Yushi; Yao, Zijun; Hou, Lei; Li, Juanzi

arXiv 2024;():

2024

Ref ID: 8361

The advancement of large language models (LLMs) relies on evaluation using public benchmarks, but data contamination can lead to overestimated performance. Previous research focuses on detecting contamination by determining whether the model has seen the exact same data during training. Moreover, prior work has already shown that even training on data similar to benchmark data inflates performance, namely in-distribution contamination. In this work, we argue that in-distribution contamination can lead to a performance drop on OOD benchmarks. To effectively detect in-distribution contamination, we propose DICE, a novel method that leverages the internal states of LLMs to locate-then-detect the contamination. DICE first identifies the most sensitive layer to contamination, then trains a classifier based on the internal states of that layer. Experiments reveal DICE's high accuracy in detecting in-distribution contamination across various LLMs and math reasoning datasets. We also show the generalization capability of the trained DICE detector, which is able to detect contamination across multiple benchmarks with similar distributions. Additionally, we find that DICE's predictions correlate with the performance of LLMs fine-tuned by either us or other organizations, achieving a coefficient of determination (R²) between 0.61 and 0.75. The code and data are available at https://github.com/THU-KEG/DICE.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#3660 - Tulchinskii 2024
Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA

Tulchinskii, Eduard; Kushnareva, Laida; Kuznetsov, Kristian; Voznyuk, Anastasia; Andriiainen, Andrei; Piontkovskaya, Irina; Burnaev, Evgeny; Barannikov, Serguei

arXiv 2024;():

2024

Ref ID: 8653

A standard way to evaluate the abilities of LLMs involves presenting a multiple-choice question and selecting the option with the highest logit as the model's predicted answer. However, such a format for evaluating LLMs has limitations, since even if the model knows the correct answer, it may struggle to select the corresponding letter simply due to difficulties in following this rigid format. To address this, we introduce new scores that better capture and reveal the model's underlying knowledge: the Query-Key Score (QK-score), derived from the interaction between query and key representations in attention heads, and the Attention Score, based on attention weights. These scores are extracted from specific select-and-copy heads, which show consistent performance across popular Multi-Choice Question Answering (MCQA) datasets. Based on these scores, our method improves knowledge extraction, yielding up to 16% gain for LLaMA2-7B and up to 10% for larger models on popular MCQA benchmarks. At the same time, the accuracy on a simple synthetic dataset, where the model explicitly knows the right answer, increases by almost 60%, achieving nearly perfect accuracy, therefore demonstrating the method's efficiency in mitigating MCQA format limitations. To support our claims, we conduct experiments on models ranging from 7 billion to 70 billion parameters in both zero- and few-shot setups.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1654 - Tuozzo 2024
Moving from Tabular Knowledge Graph Quality Assessment to RDF Triples Leveraging ChatGPT

Tuozzo, G.

CEUR Workshop Proceedings 2024;3747():9

CEUR-WS 2024

Ref ID: 4343

Data quality assessment is a multifaceted challenge involving various dimensions such as accessibility, interlinking, and completeness. These dimensions are domain-dependent and can be aggregated into a score between 0 and 1, facilitating dataset ranking based on quality. Achieving effective representation and explanation of these rankings poses significant challenges akin to those in machine learning, where interpretability and understandability are crucial. In the domain of natural language processing, data interpretation is a critical yet complex process, often requiring domain expertise and significant resources. Advanced Large Language Models (LLMs) offer promise in automating annotation tasks, ensuring consistency, and adapting to specific domains. Leveraging such models for knowledge representation tasks necessitates adept prompt engineering. This study focuses on applying state-of-the-art prompt engineering methods, particularly using GPT-3.5, to represent knowledge related to dataset quality. By exploring techniques to extract RDF triples from textual data without predefined labels or constraints, this work aims to enhance interpretability and understanding of dataset quality assessment results while verifying the feasibility of automatic knowledge representation leveraging LLMs. © 2022 Copyright for this paper by its authors.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#833 - Tupayachi 2024
Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models-A Case in Optimizing Intermodal Freight Transportation

Tupayachi, J.; Xu, H. W.; Omitaomu, O. A.; Camur, M. C.; Sharmin, A.; Li, X. P.

Smart Cities 2024;7(5):2392-2421

2024

DOI: 10.3390/smartcities7050094 · Ref ID: 3192

Highlights What are the main findings? We have developed an integrated and automated methodology that leverages a pre-trained Large Language Model (LLM) to generate scenario-based ontologies and knowledge graphs from research articles and technical manuals. Our methodology utilizes the ChatGPT API as the primary reasoning engine, supplemented by Natural Language Processing modules and carefully engineered prompts. This combination enables an automated tool capable of generating ontologies independently. The ontologies generated through our AI-powered method are interoperable and can significantly facilitate the design of data models and software architecture, particularly in the development of urban decision support systems. What is the implication of the main finding? We compared ontologies generated by our LLM with those created by human experts through CQ-based qualitative evaluation, assessing the reliability and feasibility of our approach. The methodology has been successfully applied to intermodal freight data and simulations. This has allowed us to generate a scenario-based ontology and knowledge graph that enhances data discovery, integration, and management, thereby supporting network optimization and multiple criteria decision analysis. Our methodology is both generalizable and adaptive, enabling the automation of ontology generation to support the development of urban and environmental decision support systems across various disciplines.
Abstract The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management challenges often demands deep expertise in domain science and informatics. This expertise is essential for deriving data and simulation-driven insights that support informed decision-making. In this context, we investigate the potential of leveraging the pre-trained Large Language Models (LLMs) to create knowledge representations for supporting operations research. By adopting ChatGPT-4 API as the reasoning core, we outline an applied workflow that encompasses natural language processing, Methontology-based prompt tuning, and Generative Pre-trained Transformer (GPT), to automate the construction of scenario-based ontologies using existing research articles and technical manuals of urban datasets and simulations.
From these ontologies, knowledge graphs can be derived using widely adopted formats and protocols, guiding various tasks towards data-informed decision support. The performance of our methodology is evaluated through a comparative analysis that contrasts our AI-generated ontology with the widely recognized pizza ontology, commonly used in tutorials for popular ontology software. We conclude with a real-world case study on optimizing the complex system of multi-modal freight transportation. Our approach advances urban decision support systems by enhancing data and metadata modeling, improving data integration and simulation coupling, and guiding the development of decision support strategies and essential software components.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#548 - Uma 2023
Masking Language Model Mechanism with Event-Driven Knowledge Graphs for Temporal Relations Extraction from Clinical Narratives

Uma, K.; Francis, S.; Moens, M. F.

12th International Conference on Complex Networks and their Applications (COMPLEX NETWORKS) 2023;1141():162-174

Menton, FRANCE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-53468-3_14 · Ref ID: 3524

For many natural language processing systems, the extraction of temporal links and associations from clinical narratives has been a critical challenge. To understand such processes, we must be aware of the occurrences of events and their time or temporal aspect by constructing a chronology for the sequence of events. The primary objective of temporal relation extraction is to identify relationships and correlations between entities, events, and expressions. We propose a novel architecture leveraging a Transformer-based graph neural network that combines textual data with event graph embeddings to predict temporal links across events, entities, document creation time and expressions. We demonstrate our preliminary findings on the i2b2 temporal relations corpus for predicting BEFORE, AFTER and OVERLAP links with the event graph for the correct set of relations. Various Biomedical-BERT embedding types were benchmarked, with the best performance achieved by PubMed BERT with the language model masking (LMM) mechanism in our methodology. This illustrates the effectiveness of our proposed strategy.

Kwesi voted
Davis voted
Final decision
What was the agreed final decision?

#1706 - vanCauter 2024
Ontology-guided Knowledge Graph Construction from Maintenance Short Texts

van Cauter, Z.; Yakovets, N.

KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():75-84

Association for Computational Linguistics (ACL) 2024

Ref ID: 4315

Large-scale knowledge graph construction remains infeasible since it requires significant human-expert involvement. Further complications arise when building graphs from domain-specific data due to their unique vocabularies and associated contexts. In this work, we demonstrate the ability of open-source large language models (LLMs), such as Llama-2 and Llama-3, to extract facts from domain-specific Maintenance Short Texts (MSTs). We employ an approach which combines ontology-guided triplet extraction and in-context learning. By using only 20 semantically similar examples with the Llama-3-70B-Instruct model, we achieve performance comparable to previous methods that relied on fine-tuning techniques like SpERT and REBEL. This indicates that domain-specific fact extraction can be accomplished through inference alone, requiring minimal labeled data. This opens up possibilities for effective and efficient semi-automated knowledge graph construction for domain-specific data. ©2024 Association for Computational Linguistics.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3853 - Van 2024
Rx Strategist: Prescription Verification using LLM Agents System

Van, Phuc Phan; Minh, Dat Nguyen; Ngoc, An Dinh; Thanh, Huy Phan

arXiv 2024;():

2024

Ref ID: 8580

To protect patient safety, modern pharmaceutical complexity demands strict prescription verification. We offer a new approach - Rx Strategist - that makes use of knowledge graphs and different search strategies to enhance the power of Large Language Models (LLMs) inside an agentic framework. This multifaceted technique allows for a multi-stage LLM pipeline and reliable information retrieval from a custom-built active ingredient database. Different facets of prescription verification, such as indication, dose, and possible drug interactions, are covered in each stage of the pipeline. We alleviate the drawbacks of monolithic LLM techniques by spreading reasoning over these stages, improving correctness and reliability while reducing memory demands. Our findings demonstrate that Rx Strategist surpasses many current LLMs, achieving performance comparable to that of a highly experienced clinical pharmacist. In the complicated world of modern medications, this combination of LLMs with organized knowledge and sophisticated search methods presents a viable avenue for reducing prescription errors and enhancing patient outcomes.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2102 - Varadarajan 2015
Affordance and k-TR Augmented Alphabet based Neuro-Symbolic language — Af-kTRAANS — A Human-Robot Interaction meta-language

Varadarajan, K. M.; Vincze, M.

2015 20th International Conference on Methods and Models in Automation and Robotics (MMAR) 2015;():394-399

2015

DOI: 10.1109/MMAR.2015.7283908 · Ref ID: 6092

Human-Robot Interaction (HRI) and Inter-Robot Communication (ICI) are rapidly evolving fields with little standardization. A number of middleware architectures, frameworks and programming languages exist for implementing algorithms on robots. Also, efforts have been made to enable robots to understand the multitude of natural languages available. Nevertheless, there is a definite lack of intermediary languages for the representation of symbol grounding mechanisms in robots and of standards for inter-robot cognitive communication. We address this void by presenting an intermediary meta-language based on a perceptually grounded algorithmic alphabet - the Affordance and kTR Augmented Alphabet based Neuro-Symbolic language, in short Af-kTRAANS - yielding an abstract layer sandwiched between the natural and the programming language layers that robots can use for knowledge representation, sharing and communication, while being agnostic to the embodiment, the pertinent human language, as well as socio-cultural contexts and environments. Based on the k-TR theory of cognitive visual perception and implemented for practical systems using the Affordance Network (AfNet) and the AfRob ontology, the graphical language can support a wide variety of object definition phrases as well as action verbs/object interaction commands while providing the necessary succinctness for tractable modeling. The various aspects of this cognitive inter-robot communication language are presented in this paper. Several examples of usage of the graphical language for common robotic task based queries are demonstrated along with the grounding mechanisms.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#427 - Varshney 2023
Knowledge graph assisted end-to-end medical dialog generation

Varshney, D.; Zafar, A.; Behera, N. K.; Ekbal, A.

Artif. Intell. Med. 2023;139():10

2023

DOI: 10.1016/j.artmed.2023.102535 · Ref ID: 3087

Medical dialog systems have the potential to assist e-medicine in improving access to healthcare services, improving patient treatment quality, and lowering medical expenses. In this research, we describe a knowledge-grounded conversation generation model that demonstrates how large-scale medical information in the form of knowledge graphs can aid in language comprehension and generation in medical dialog systems. Generic responses are often produced by existing generative dialog systems, resulting in monotonous and uninteresting conversations. To solve this problem, we combine various pre-trained language models with a medical knowledge base (UMLS) to generate clinically correct and human-like medical conversations using the recently released MedDialog-EN dataset. The medical-specific knowledge graph contains broadly 3 types of medical-related information, including disease, symptom and laboratory test. We perform reasoning over the retrieved knowledge graph by reading the triples in each graph using MedFact attention, which allows us to use semantic information from the graphs for better response generation. In order to preserve medical information, we employ a policy network, which effectively injects relevant entities associated with each dialog into the response. We also study how transfer learning can significantly improve the performance by utilizing a relatively small corpus, created by extending the recently released CovidDialog dataset, containing the dialogs for diseases that are symptoms of Covid-19. Empirical results on the MedDialog corpus and the extended CovidDialog dataset demonstrate that our proposed model significantly outperforms the state-of-the-art methods in terms of both automatic evaluation and human judgment.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3531 - Vasisht 2024
Infusing Knowledge into Large Language Models with Contextual Prompts

Vasisht, Kinshuk; Ganesan, Balaji; Kumar, Vikas; Bhatnagar, Vasudha

arXiv 2024;():

2024

Ref ID: 8155

Knowledge infusion is a promising method for enhancing Large Language Models for domain-specific NLP tasks rather than pre-training models over large data from scratch. These augmented LLMs typically depend on additional pre-training or knowledge prompts from an existing knowledge graph, which is impractical in many applications. In contrast, knowledge infusion directly from relevant documents is more generalisable and alleviates the need for structured knowledge graphs while also being useful for entities that are usually not found in any knowledge graph. With this motivation, we propose a simple yet generalisable approach for knowledge infusion by generating prompts from the context in the input text. Our experiments show the effectiveness of our approach which we evaluate by probing the fine-tuned LLMs.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#2620 - Vassev 2012
Knowledge representation with KnowLang: the marXbot case study

Vassev, E.; Hinchey, M.

2012 IEEE 11th International Conference on Cybernetic Intelligent Systems (CIS) 2012;():18-23

2012

DOI: 10.1109/CIS.2013.6782155 · Ref ID: 6051

Intelligent systems exhibit AI through knowledge representation and reasoning, which helps to connect abstract knowledge symbols to real-world meanings. This paper presents a formal language for knowledge representation called KnowLang. The language employs a multi-tier specification model emphasizing knowledge corpuses, knowledge base operators and inference primitives. The approach allows for efficient and comprehensive knowledge structuring where ontologies are integrated with rules and Bayesian networks. The paper presents the KnowLang specification constructs formally, along with a case study based on a mobile robotics platform.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1864 - Venkatakrishnan 2024
Semantic interlinking of Immigration Data using LLMs for Knowledge Graph Construction

Venkatakrishnan, R.; Tanyildizi, E.; Canbaz, M. A.

WWW 2024 Companion - Companion Proceedings of the ACM Web Conference 2024;():605-608

Association for Computing Machinery, Inc 2024

DOI: 10.1145/3589335.3651557 · Ref ID: 4075

The challenge of managing immigration data is exacerbated by its reliance on paper-based, evidence-driven records maintained by legal professionals, creating obstacles for efficient processing and analysis due to inherent trust issues with AI-based systems. This paper introduces a cutting-edge framework to surmount these hurdles by synergizing Large Language Models (LLMs) with Knowledge Graphs (KGs), revolutionizing traditional data handling methods. Our method transforms archaic, paper-based immigration records into a structured, interconnected knowledge network that intricately mirrors the legal and procedural nuances of immigration, ensuring a dynamic and trustworthy platform for data analysis. Utilizing LLMs, we extract vital entities and relationships from diverse legal documents to forge a comprehensive knowledge graph, encapsulating the complex legalities and procedural disparities in immigration processes and mapping the multifaceted interactions among stakeholders like applicants, sponsors, and legal experts. This graph not only facilitates a deep dive into the legal stipulations but also incorporates them, significantly boosting the system’s reliability and precision. With the integration of Retrieval Augmented Generation (RAG) for exact, context-aware data retrieval and Augmented Knowledge Creation for developing a conversational interface via LLMs, our framework offers a scalable, adaptable solution to immigration data management. This innovative amalgamation of LLMs, KGs, and RAG techniques marks a paradigm shift towards more informed, efficient, and trustworthy decision-making in the sphere of global migration, setting a new benchmark for legal technology and data source management. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3730 - Vijay 2022
NERDA-Con: Extending NER models for Continual Learning – Integrating Distinct Tasks and Updating Distribution Shifts

Vijay, Supriti; Priyanshu, Aman

arXiv 2022;():

2022

Ref ID: 7562

With increasing applications in areas such as biomedical information extraction pipelines and social media analytics, Named Entity Recognition (NER) has become an indispensable tool for knowledge extraction. However, with the gradual shift in language structure and vocabulary, NERs are plagued with distribution shifts, making them redundant or not as profitable without re-training. Re-training NERs based on Large Language Models (LLMs) from scratch over newly acquired data poses economic disadvantages. In contrast, re-training only with newly acquired data will result in Catastrophic Forgetting of previously acquired knowledge. Therefore, we propose NERDA-Con, a pipeline for training NERs with LLM bases by incorporating the concept of Elastic Weight Consolidation (EWC) into the NER fine-tuning NERDA pipeline. As we believe our work has implications to be utilized in the pipeline of continual learning and NER, we open-source our code as well as provide the fine-tuning library of the same name NERDA-Con at https://github.com/SupritiVijay/NERDA-Con and https://pypi.org/project/NERDA-Con/.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#716 - Vizcarra 2024
Representing the Interaction between Users and Products via LLM-assisted Knowledge Graph Construction

Vizcarra, J.; Haruta, S.; Kurokawa, M.; Ieee

18th IEEE International Conference on Semantic Computing (ICSC) 2024;():231-232

Laguna Hills, CA Ieee Computer Soc 2024

DOI: 10.1109/icsc59802.2024.00043 · Ref ID: 2985

To understand user behavior, representing the semantic knowledge of user-product interaction is essential. In this paper, we represent the interaction between user and product via large language model (LLM)-assisted knowledge graph construction. We capture users' behavioral actions and static properties of the products from raw text data of "user review" and "product catalog". Moreover, the information needed for updating the knowledge graph is captured by raw texts of "news related to the products". The proposed methodology integrates them as a single knowledge graph to provide causal reasoning on user-product interaction. To alleviate the situation where a small quantity of annotated text exists in these data, we use LLM as a data annotator and augmentor.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3852 - Vogt 2024
Rosetta Statements: Lowering the Barrier for Semantic Parsing and Increasing the Cognitive Interoperability of Knowledge Graphs

Vogt, Lars; Konrad, Marcel; Farfar, Kheir Eddine; Prinz, Manuel; Oelen, Allard

arXiv 2024;():

2024

Ref ID: 8492

Machines need data and metadata to be machine-actionable and FAIR (findable, accessible, interoperable, reusable) to manage increasing data volumes. Knowledge graphs and ontologies are key to this, but their use is hampered by high access barriers due to required prior knowledge in semantics and data modelling. The Rosetta Statement approach proposes modeling English natural language statements instead of a mind-independent reality. We propose a metamodel for creating semantic schema patterns for simple statement types. The approach supports versioning of statements and provides a detailed editing history. Each Rosetta Statement pattern has a dynamic label for displaying statements as natural language sentences. Implemented in the Open Research Knowledge Graph (ORKG) as a use case, this approach allows domain experts to define data schema patterns without needing semantic knowledge. Future plans include combining Rosetta Statements with semantic units to organize ORKG into meaningful subgraphs, improving usability. A search interface for querying statements without needing SPARQL or Cypher knowledge is also planned, along with tools for data entry and display using Large Language Models and NLP. The Rosetta Statement metamodel supports a two-step knowledge graph construction procedure. Domain experts can model semantic content without support from ontology engineers, lowering entry barriers and increasing cognitive interoperability. The second level involves developing semantic graph patterns for reasoning, requiring collaboration with ontology engineers.

Mike voted
brandon voted
Final decision
What was the agreed final decision?

#668 - Vulić 2020
Probing Pretrained Language Models for Lexical Semantics

Vulić, I.; Ponti, E. M.; Litschko, R.; Glavaš, G.; Korhonen, A.

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():7222-7240

Online Association for Computational Linguistics 2020

Ref ID: 3648

The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. While prior research focused on morphosyntactic, semantic, and world knowledge, it remains unclear to which extent LMs also derive lexical type-level knowledge from words in context. In this work, we present a systematic empirical analysis across six typologically diverse languages and five different lexical tasks, addressing the following questions: 1) How do different lexical knowledge extraction strategies (monolingual versus multilingual source LM, out-of-context versus in-context encoding, inclusion of special tokens, and layer-wise averaging) impact performance? How consistent are the observed effects across tasks and languages? 2) Is lexical knowledge stored in few parameters, or is it scattered throughout the network? 3) How do these representations fare against traditional static word vectors in lexical tasks? 4) Does the lexical information emerging from independently trained monolingual LMs display latent similarities? Our main results indicate patterns and best practices that hold universally, but also point to prominent variations across languages and tasks. Moreover, we validate the claim that lower Transformer layers carry more type-level lexical knowledge, but also show that this knowledge is distributed across multiple layers.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#3339 - Wadhwa 2024
Distilling Event Sequence Knowledge From Large Language Models

Wadhwa, Somin; Hassanzadeh, Oktie; Bhattacharjya, Debarun; Barker, Ken; Ni, Jian

arXiv 2024;():

2024

Ref ID: 8032

Event sequence models have been found to be highly effective in the analysis and prediction of events. Building such models requires availability of abundant high-quality event sequence data. In certain applications, however, clean structured event sequences are not available, and automated sequence extraction results in data that is too noisy and incomplete. In this work, we explore the use of Large Language Models (LLMs) to generate event sequences that can effectively be used for probabilistic event model construction. This can be viewed as a mechanism of distilling event sequence knowledge from LLMs. Our approach relies on a Knowledge Graph (KG) of event concepts with partial causal relations to guide the generative language model for causal event sequence generation. We show that our approach can generate high-quality event sequences, filling a knowledge gap in the input KG. Furthermore, we explore how the generated sequences can be leveraged to discover useful and more complex structured knowledge from pattern mining and probabilistic event models. We release our sequence generation code and evaluation framework, as well as a corpus of event sequence data.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1000 - Wan 2024
Aspect-Based Sentiment Classification Model Based on Multi-view Information Fusion

Wan, Y.; Cai, T.; Li, Y.; Ju, S.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14883 LNCS():16-28

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-981-97-7707-5_2 · Ref ID: 4287

Aspect-based sentiment classification is one of the hot tasks in the field of natural language processing. The task aims to judge the sentiment polarity of the target word, also known as the aspect term, specified in a sentence. Current mainstream models aggregate information from the aspect term's neighbor nodes through a graph neural network to judge sentiment polarity. Compared with previous research, this approach has achieved clear gains, but it still faces some problems. First, the limited scale of existing public datasets constrains model training, leaving the general knowledge representation ability deficient. Second, existing methods use single-view information to judge sentiment polarity but lack multi-view information and corresponding information fusion methods; the complementarity of sentiment features from different perspectives has not been studied. To solve these problems, an aspect-based sentiment classification model based on multi-view information fusion is proposed. By constructing an inference result set from a large language model (LLM), the LLM's results are used to enhance the model's knowledge representation ability. A multi-view information fusion module is proposed to integrate information at two levels, local fusion and global fusion, and make full use of information from different angles. The experimental results show that the model has higher classification ability than current mainstream models, and the effectiveness of each module is verified by a variety of experiments. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1894 - Wang 2024
SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge

Wang, A.; Wu, B.; Chen, S.; Chen, Z.; Guan, H.; Lee, W. N.; Li, L. E.; Gan, C.

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2024;():13384-13394

IEEE Computer Society 2024

DOI: 10.1109/CVPR52733.2024.01271 · Ref ID: 4524

Learning commonsense reasoning from visual contexts and scenes in the real world is a crucial step toward advanced artificial intelligence. However, existing video reasoning benchmarks are still inadequate, since they were mainly designed for factual or situated reasoning and rarely involve broader knowledge of the real world. Our work aims to delve deeper into reasoning evaluations, specifically within dynamic, open-world, and structured context knowledge. We propose a new benchmark (SOK-Bench), consisting of 44K questions and 10K situations with instance-level annotations depicted in the videos. The reasoning process is required to understand and apply situated knowledge and general knowledge for problem-solving. To create such a dataset, we propose an automatic and scalable generation method to generate question-answer pairs, knowledge graphs, and rationales by instructing combinations of LLMs and MLLMs. Concretely, we first extract observable situated entities, relations, and processes from videos for situated knowledge and then extend to open-world knowledge beyond the visible content. The task generation is facilitated through multiple dialogues as iterations and subsequently corrected and refined by our designed self-promptings and demonstrations. With a corpus of both explicit situated facts and implicit commonsense, we generate associated question-answer pairs and reasoning processes, finally followed by manual reviews for quality assurance. We evaluated recent mainstream large vision-language models on the benchmark and found several insightful conclusions. For more information, please refer to our benchmark at www.bobbywu.com/SOKBench. © 2024 IEEE.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3527 - Wang 2020
Inductive Learning on Commonsense Knowledge Graph Completion

Wang, Bin; Wang, Guangtao; Huang, Jing; You, Jiaxuan; Leskovec, Jure; Kuo, C. C. Jay

arXiv 2020;():

2020

Ref ID: 7407

Commonsense knowledge graph (CKG) is a special type of knowledge graph (KG) in which entities are composed of free-form text. However, most existing CKG completion methods focus on the setting where all entities are present at training time. Although this setting is standard for conventional KG completion, it has limitations for CKG completion. At test time, entities in CKGs can be unseen because they may have unseen text/names, and entities may be disconnected from the training graph, since CKGs are generally very sparse. Here, we propose to study the inductive learning setting for CKG completion, where unseen entities may be present at test time. We develop a novel learning framework named InductivE. Different from previous approaches, InductivE ensures the inductive learning capability by directly computing entity embeddings from raw entity attributes/text. InductivE consists of a free-text encoder, a graph encoder, and a KG completion decoder. Specifically, the free-text encoder first extracts the textual representation of each entity based on a pre-trained language model and word embeddings. The graph encoder is a gated relational graph convolutional neural network that learns from a densified graph for more informative entity representation learning. We develop a method that densifies CKGs by adding edges among semantically related entities, providing more supportive information for unseen entities and leading to better generalization of entity embeddings for unseen entities. Finally, InductivE employs Conv-TransE as the CKG completion decoder. Experimental results show that InductivE significantly outperforms state-of-the-art baselines in both standard and inductive settings on the ATOMIC and ConceptNet benchmarks. InductivE performs especially well in inductive scenarios, where it achieves above 48% improvement over existing methods.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#458 - Wang 2023
Knowledge Graphs Enhanced Large Language Model Prompt for Electric Power Question Answering

Wang, C.; Hua, M.; Song, J. L.; Tang, X. S.; Assoc Computing, Machinery

7th International Conference on Electronic Information Technology and Computer Engineering (EITCE) 2023;():24-29

Xiamen, PEOPLES R CHINA Assoc Computing Machinery 2023

DOI: 10.1145/3650400.3650405 · Ref ID: 2925

With the continuous development and digital transformation of the electric power field, the application of large language models in the electric power industry has become a remarkable trend. The electric power industry is an information-intensive domain involving extensive data processing, predictive analysis, and decision-making. Therefore, the application of large language models in the electric power sector is of great significance. Current large language models such as GPT-3.5 and GLM perform well in tasks such as question-answering dialogue. However, these models still face challenges such as answer hallucination and inaccurate responses. This paper proposes a method to enhance question answering in large language models using knowledge graphs, aiming to improve the accuracy and reliability of these models on question-answering tasks in the electric power domain. The proposed method first utilizes local electric power data to extract triplets and generate a question-answering dataset specific to the electric power domain using a large language model. Then, the relationships of the knowledge graph triplets are incorporated into the question prompt to enhance the quality of the model's answers. Furthermore, we fine-tune the large language model using the expanded question set derived from the triplets as knowledge-enhanced data. Subsequently, we conduct experiments on both an electric power question-answering dataset and a knowledge graph question-answering dataset. The experimental results demonstrate that our method significantly improves various metrics of the large language model on the electric power question-answering task. This research provides new insights and approaches to enhance the effectiveness of question-answering systems in the electric power domain. Future studies can further explore and optimize this prompt expansion method for application in broader domains and tasks.
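The triplet-to-prompt step the abstract describes can be sketched as follows. This is a minimal illustration only: the function name, the triple serialization format, and the example facts are all assumptions, not details taken from the paper.

```python
def build_kg_prompt(question, triples):
    """Hypothetical sketch: serialize KG triples as factual context
    lines placed ahead of the user's question in the LLM prompt."""
    # one "(head, relation, tail)" line per triple
    facts = "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)
    return (
        "Answer the question using the facts below.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# hypothetical electric-power example
prompt = build_kg_prompt(
    "What device protects a transformer from overcurrent?",
    [("transformer", "protected_by", "circuit breaker")],
)
print(prompt)
```

The resulting string would then be sent to the LLM, so the model's answer is constrained by the serialized graph relations rather than by its parametric knowledge alone.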

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1694 - Wang 2021
Novel semantic retrieval approach for semi-structured knowledge in industrial software development

Wang, C.; Jiang, Z.; Wang, F.; Ji, Y.; Jiang, H.

Jisuanji Jicheng Zhizao Xitong 2021;27(8):2371-2381

2021

DOI: 10.13196/j.cims.2021.08.019 · Ref ID: 5506

In knowledge-driven industrial software development, assisting engineers in searching heterogeneous semi-structured knowledge efficiently and accurately is a major issue. A semantic retrieval method based on a knowledge super network model is proposed. The knowledge super network, consisting of a product subnet, an object subnet, and a knowledge subnet, is built from the relations between code-reuse concepts and the attributes of engineering knowledge. To calculate the process context correlation between a user query and engineering knowledge, conceptual knowledge and a language model are integrated by a Bayesian method. Experimental results on a Microsoft knowledge base dataset show that the proposed approach improves the precision of knowledge retrieval compared with several semantic retrieval methods. The feasibility and effectiveness of the approach are also verified. © 2021, Editorial Department of CIMS. All rights reserved.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#83 - Wang 2021
Can Generative Pre-trained Language Models Serve as Knowledge Bases for Closed-book QA?

Wang, C. X.; Liu, P.; Zhang, Y.; Assoc Computat, Linguist

Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():3241-3251

Electr Network Assoc Computational Linguistics-Acl 2021

Ref ID: 3546

Recent work has investigated the interesting question of whether pre-trained language models (PLMs) can serve as knowledge bases for answering open questions. However, existing work is limited to small benchmarks with high test-train overlap. We construct a new closed-book QA dataset using SQuAD and investigate the performance of BART. Experiments show that it is challenging for BART to remember training facts with high precision, and also challenging to answer closed-book questions even when the relevant knowledge is retained. Some promising directions are found, including decoupling the knowledge memorization process from the QA fine-tuning process, and forcing the model to recall relevant knowledge when answering questions.

Xinchen voted
Ishan voted
Final decision
What was the agreed final decision?

#517 - Wang 2024
Let Me Show You Step by Step: An Interpretable Graph Routing Network for Knowledge-based Visual Question Answering

Wang, D. K.; Hu, L. M.; Hao, R.; Shao, Y. X.; Lv, X.; Nie, L. Q.; Li, J. Z.; Assoc Computing, Machinery

47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():1984-1994

Washington, DC Assoc Computing Machinery 2024

DOI: 10.1145/3626772.3657790 · Ref ID: 3424

Visual Question Answering based on external Knowledge Bases (KB-VQA) requires a model to incorporate knowledge beyond the content of the given image and question for answer prediction. Most existing works use graph neural networks or multi-modal large language models to incorporate external knowledge for answer generation. Despite promising results, they have limited interpretability and exhibit a deficiency in handling questions with unseen answers. In this paper, we propose a novel interpretable graph routing network (GRN) which explicitly conducts entity routing over a constructed scene knowledge graph, step by step, for KB-VQA. At each step, GRN keeps an entity score vector representing how likely each entity is to be activated as the answer, and a transition matrix representing the transition probability from one entity to another. To answer the given question, GRN focuses on certain keywords of the question at each step and correspondingly conducts entity routing by transiting the entity scores according to the transition matrix computed from the focused question keywords. In this way, it clearly exposes the reasoning process of KB-VQA and can handle questions with unseen answers without distinction. Experiments on the benchmark dataset KRVQA demonstrate that GRN improves the performance of KB-VQA by a large margin, surpassing existing state-of-the-art KB-VQA methods and multi-modal large language models, and shows competent capability in handling unseen answers and good interpretability.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3194 - Wang 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

Wang, Fei; Wan, Xingchen; Sun, Ruoxi; Chen, Jiefeng; Arık, Sercan Ö

arXiv 2024;():

2024

Ref ID: 8678

Retrieval-Augmented Generation (RAG), while effective in integrating external knowledge to address the limitations of large language models (LLMs), can be undermined by imperfect retrieval, which may introduce irrelevant, misleading, or even malicious information. Despite its importance, previous studies have rarely explored the behavior of RAG by jointly analyzing how errors from imperfect retrieval arise and propagate, and how potential conflicts arise between the LLMs' internal knowledge and external sources. Through controlled analysis under realistic conditions, we find that imperfect retrieval augmentation might be inevitable and quite harmful. We identify the knowledge conflicts between LLM-internal and external knowledge from retrieval as a bottleneck to overcome in the post-retrieval stage of RAG. To render LLMs resilient to imperfect retrieval, we propose Astute RAG, a novel RAG approach that adaptively elicits essential information from LLMs' internal knowledge, iteratively consolidates internal and external knowledge with source-awareness, and finalizes the answer according to information reliability. Our experiments using Gemini and Claude demonstrate that Astute RAG significantly outperforms previous robustness-enhanced RAG methods. Notably, Astute RAG is the only approach that matches or exceeds the performance of LLMs without RAG under worst-case scenarios. Further analysis reveals that Astute RAG effectively resolves knowledge conflicts, improving the reliability and trustworthiness of RAG systems.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1786 - Wang 2024
A Real-Time Rumor Detection Method Based on the Graph Attention Neural Network Integrated with the Knowledge Graph

Wang, G.; Zhu, Y.; Li, S.

Data. Anal. Knowl. Discov. 2024;8(6):95-106

2024

DOI: 10.11925/infotech.2096-3467.2023.0314 · Ref ID: 3955

[Objective] This paper aims to improve the accuracy of real-time rumor detection in social media and reduce the harm caused by rumors. [Methods] A real-time rumor detection method based on a graph attention neural network integrated with a knowledge graph is proposed. First, the background knowledge of the text is obtained from an external knowledge graph by knowledge distillation. Second, we transform the text and background knowledge into a weighted graph representation using pointwise mutual information, and a weighted graph attention neural network learns the discontinuous semantic features of the text from the weighted graph. Then, the continuous semantic features of the text are learned by the pre-trained language model BERT, and the statistical features of users and content are converted into continuous vector representations using the embedding method. Finally, all the features are fused and input into a fully connected neural network for rumor detection. [Results] Experimental results on two public social media rumor datasets, PHEME and WEIBO, show that the method's accuracy reaches 92.1% and 84.0%, respectively, higher than the state-of-the-art baseline methods. [Limitations] The method does not fuse image or video information that may be attached to a post and cannot perform multi-modal fusion rumor detection. [Conclusions] Fusing background knowledge can complement the semantic representation of short texts. Fusing user and content statistical features can support semantic features in decision making and improve the accuracy of the model. © 2024 Chinese Academy of Sciences. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#593 - Wang 2024
Multivariate graph neural networks on enhancing syntactic and semantic for aspect-based sentiment analysis

Wang, H. Y.; Qiu, X. H.; Tan, X. Y.

Appl. Intell. 2024;54(22):11672-11689

2024

DOI: 10.1007/s10489-024-05802-6 · Ref ID: 3754

Aspect-based sentiment analysis (ABSA) aims to predict sentiment orientations towards textual aspects by extracting insights from user comments. While pretrained large language models (LLMs) demonstrate proficiency in sentiment analysis, incorporating syntactic and semantic features into ABSA remains a challenge. Additionally, employing LLMs for sentiment analysis often requires significant computational resources, rendering them impractical for use by individuals or small-scale entities. To address this, we propose the semiotic signal integration network (SSIN), which effectively combines syntactic and semantic features. The core syncretic information network leverages isomorphism and syntax to enhance knowledge acquisition. The semantically guided syntactic attention module further enables integrated semiotic representations via sophisticated attention mechanisms. Experiments on the publicly available SemEval dataset show that SSIN performs better than existing state-of-the-art ABSA baselines and LLMs such as Llama and Alpaca with high accuracy and macro-F1 scores. Moreover, our model demonstrates exceptional interpretability and the ability to discern both positive and negative sentiments, which is vitally important for real-world applications such as social media monitoring, health care, and customer service. Code is available at https://github.com/AmbitYuki/SSIN.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3230 - Wang 2024
BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering

Wang, Haoyu; Li, Ruirui; Jiang, Haoming; Tian, Jinjin; Wang, Zhengyang; Luo, Chen; Tang, Xianfeng; Cheng, Monica; Zhao, Tuo; Gao, Jing

arXiv 2024;():

2024

Ref ID: 8110

Retrieval-augmented Large Language Models (LLMs) offer substantial benefits in enhancing performance across knowledge-intensive scenarios. However, these methods often face challenges with complex inputs and encounter difficulties due to noisy knowledge retrieval, notably hindering model effectiveness. To address this issue, we introduce BlendFilter, a novel approach that elevates retrieval-augmented LLMs by integrating query generation blending with knowledge filtering. BlendFilter proposes the blending process through its query generation method, which integrates both external and internal knowledge augmentation with the original query, ensuring comprehensive information gathering. Additionally, our distinctive knowledge filtering module capitalizes on the intrinsic capabilities of the LLM, effectively eliminating extraneous data. We conduct extensive experiments on three open-domain question answering benchmarks, and the findings clearly indicate that our innovative BlendFilter surpasses state-of-the-art baselines significantly.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3030 - Wang 2006
Towards Representing FCA-based Ontologies in Semantic Web Rule Language

Wang, J.; He, K.

The Sixth IEEE International Conference on Computer and Information Technology (CIT'06) 2006;():41-41

2006

DOI: 10.1109/CIT.2006.186 · Ref ID: 6139

Formal Concept Analysis (FCA) has recently been widely applied in many fields. In this paper, we introduce how a domain ontology can be constructed based on FCA. The ontology constructed in this way is graphically represented as a concept lattice. After constructing FCA-based ontologies, it is necessary to represent them in a formalism for sharing and reasoning. The Semantic Web Rule Language (SWRL), a W3C proposal, extends OWL with Horn clause rules. We represent the FCA-based ontologies in SWRL with some extensions, making them more suitable for reasoning.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#360 - Wang 2024
Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive Learning

Wang, J.; Qu, J. F.; Wang, K. X.; Li, Z. X.; Hua, W.; Li, X. M.; Liu, A.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():19135-19143

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3483

Knowledge-grounded dialogue (KGD) learns to generate an informative response based on a given dialogue context and external knowledge (e.g., knowledge graphs; KGs). Recently, the emergence of large language models (LLMs) and pre-training techniques has brought great success to knowledge-grounded dialogue. However, when building KGD systems for real applications, various real-world noises are inevitable. For example, the dialogue context might involve perturbations such as misspellings and abbreviations. In addition, KGs typically suffer from incompleteness and might contain erroneous and outdated facts. Such real-world noises pose a challenge to the robustness of KGD systems and hinder their applications in the real world. In this paper, we propose an entity-based contrastive learning framework for improving the robustness of KGD. Specifically, we use the entity information in a KGD sample to create both its positive and negative samples, which involve semantically irrelevant and semantically relevant perturbations, respectively. The contrastive learning framework ensures the KGD model is aware of these two types of perturbations, and thus generates informative responses given the potentially noisy inputs of real applications. Experimental results on three benchmark datasets show that our method achieves new state-of-the-art performance in terms of automatic evaluation scores, verifying its effectiveness and potential. Furthermore, we show that our method generates better responses than comparison models in both the noisy and the few-shot settings.

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#3117 - Wang 2024
Causal-driven Large Language Models with Faithful Reasoning for Knowledge Question Answering

Wang, Jiawei; Cao, Da; Lu, Shaofei; Ma, Zhanchang; Xiao, Junbin; Chua, Tat-Seng

Proceedings of the 32nd ACM International Conference on Multimedia 2024;():4331–4340

Melbourne VIC, Australia Association for Computing Machinery 2024

DOI: 10.1145/3664647.3681263 · Ref ID: 7306

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3557 - Wang 2024
JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability

Wang, Junda; Yang, Zhichao; Yao, Zonghai; Yu, Hong

arXiv 2024;():

2024

Ref ID: 8145

Large Language Models (LLMs) have demonstrated a remarkable potential in medical knowledge acquisition and question-answering. However, LLMs can potentially hallucinate and yield factually incorrect outcomes, even with domain-specific pretraining. Previously, retrieval-augmented generation (RAG) has had limited success in addressing hallucinations. Unlike previous RAG methods, in which the retrieval model was trained separately from the LLM, we introduce JMLR (for Jointly trains LLM and information Retrieval) during the fine-tuning phase. The synchronized training mechanism enhances JMLR's ability to retrieve clinical guidelines and leverage medical knowledge to reason and answer questions, and reduces the demand for computational resources. We evaluated JMLR on the important medical question-answering application. Our experimental results demonstrate that JMLR-13B (70.5%) outperforms a previous state-of-the-art open-source model using conventional pre-training and fine-tuning, Meditron-70B (68.9%), and Llama2-13B with RAG (67.7%) on a medical question-answering dataset. Comprehensive evaluations reveal that JMLR-13B enhances reasoning quality and reduces hallucinations better than Claude3-Opus. Additionally, JMLR-13B (148 GPU hours) also trains much faster than Meditron-70B (42630 GPU hours). Through this work, we provide a new and efficient knowledge enhancement method for healthcare, demonstrating the potential of integrating retrieval and LLM training for medical question-answering systems.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1607 - Wang 2024
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs

Wang, K.; Xu, Y.; Wu, Z.; Luo, S.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():3742-3759

Association for Computational Linguistics (ACL) 2024

Ref ID: 4284

Knowledge Graph (KG) inductive reasoning, which aims to infer missing facts from new KGs that are not seen during training, has been widely adopted in various applications. One critical challenge of KG inductive reasoning is handling low-resource scenarios with scarcity in both textual and structural aspects. In this paper, we attempt to address this challenge with Large Language Models (LLMs). Particularly, we utilize the state-of-the-art LLMs to generate a graph-structural prompt to enhance the pre-trained Graph Neural Networks (GNNs), which brings us new methodological insights into the KG inductive reasoning methods, as well as high generalizability in practice. On the methodological side, we introduce a novel pretraining and prompting framework PROLINK, designed for low-resource inductive reasoning across arbitrary KGs without requiring additional training. On the practical side, we experimentally evaluate our approach on 36 low-resource KG datasets and find that PROLINK outperforms previous methods in three-shot, one-shot, and zero-shot reasoning tasks, exhibiting average performance improvements by 20%, 45%, and 147%, respectively. Furthermore, PROLINK demonstrates strong robustness for various LLM promptings as well as full-shot scenarios. Our source code is available on https://github.com/KyneWang/ProLINK. © 2024 Association for Computational Linguistics.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3678 - Wang 2024
LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation

Wang, Keheng; Duan, Feiyu; Li, Peiguang; Wang, Sirui; Cai, Xunliang

arXiv 2024;():

2024

Ref ID: 8246

Retrieval-Augmented Generation (RAG) demonstrates great value in alleviating outdated knowledge and hallucination by supplying LLMs with updated, relevant knowledge. However, RAG still has difficulty understanding complex multi-hop queries and retrieving relevant documents, which requires LLMs to perform reasoning and retrieval step by step. Inspired by the human reasoning process, in which people gradually search for the information they need, it is natural to ask whether LLMs can notice what information is missing at each reasoning step. In this work, we first experimentally verify the ability of LLMs to extract information as well as to recognize what is missing. Based on this discovery, we propose a Missing Information Guided Retrieve-Extraction-Solving paradigm (MIGRES), in which we leverage the identification of missing information to generate a targeted query that steers the subsequent knowledge retrieval. In addition, we design a sentence-level re-ranking and filtering approach to filter irrelevant content out of documents, along with using the information extraction capability of LLMs to extract useful information from the cleaned-up documents, which in turn bolsters the overall efficacy of RAG. Extensive experiments conducted on multiple public datasets reveal the superiority of the proposed MIGRES method, and analytical experiments demonstrate the effectiveness of the proposed modules.

Kwesi voted
yuexi voted
Final decision
What was the agreed final decision?

#533 - Wang 2024
LLM-Assisted Analytics in Semiconductor Test (Invited)

Wang, L. C.; Acm

6th International Symposium on Machine Learning for CAD (MLCAD) 2024;():

Snowbird, UT Assoc Computing Machinery 2024

DOI: 10.1145/3670474.3685974 · Ref ID: 3222

The emergence of Large Language Models (LLMs) has impacted our perspective on applying Machine Learning (ML) in semiconductor test. This paper shares our experience in leveraging the power of LLMs to build an AI agent for test data analytics. We advocate for an end-to-end approach where the Knowledge Graph (KG) plays a central role. Using wafermap analytics as an example, we highlight the key ideas behind developing the LLM-assisted AI agent named IEA-Plot, and discuss its practical applications.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#454 - Wang 2023
Knowledge Graph-Based Method for Intelligent Generation of Emergency Plans for Water Conservancy Projects

Wang, L. H.; Liu, X. M.; Liu, Y.; Li, H. R.; Liu, J. Q.; Yang, L. B.

IEEE Access 2023;11():84414-84429

2023

DOI: 10.1109/access.2023.3302399 · Ref ID: 3110

In response to the issues of poor content correlation and insufficient intelligent decision support in emergency plans for water conservancy projects, a method for intelligent generation of emergency plans based on knowledge graphs is proposed. Utilizing pre-trained language models (PTM) based on entity masking, the accuracy of entity recognition tasks is enhanced by uncovering contextual features surrounding the masked entities. By employing translations, rotations, and superpositions within the vector space, a multiview convolutional neural network (MCNN) is constructed to enhance the accuracy of relation extraction through complementary and integrated feature representation. Integrating PTM with MCNN enables the construction of an emergency entity relationship extraction method based on PTM-MCNN. Neo4j is utilized for storing entity relationship triplets to construct an emergency knowledge graph. Through the utilization of the mutual information criterion, knowledge retrieval and matching are performed to accomplish the intelligent generation of emergency plans. The results indicate that PTM-MCNN achieves high recognition accuracy (F1 score of 92.2%), ensuring the reliability of the generated emergency plans. These methods can effectively improve the intelligence of emergency management for water conservancy projects.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#772 - Wang 2022
SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models

Wang, L.; Zhao, W.; Wei, Z. Y.; Liu, J. M.; Assoc Computat, Linguist

60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():4281-4294

Dublin, IRELAND Assoc Computational Linguistics-Acl 2022

Ref ID: 3405

Knowledge graph completion (KGC) aims to reason over known facts and infer the missing links. Text-based methods such as KG-BERT (Yao et al., 2019) learn entity representations from natural language descriptions, and have the potential for inductive KGC. However, the performance of text-based methods still largely lags behind that of graph embedding-based methods like TransE (Bordes et al., 2013) and RotatE (Sun et al., 2019b). In this paper, we identify that the key issue is efficient contrastive learning. To improve the learning efficiency, we introduce three types of negatives: in-batch negatives, pre-batch negatives, and self-negatives which act as a simple form of hard negatives. Combined with InfoNCE loss, our proposed model SimKGC can substantially outperform embedding-based methods on several benchmark datasets. In terms of mean reciprocal rank (MRR), we advance the state-of-the-art by +19% on WN18RR, +6.8% on the Wikidata5M transductive setting, and +22% on the Wikidata5M inductive setting. Thorough analyses are conducted to gain insights into each component. Our code is available at https://github.com/intfloat/SimKGC.
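As a rough illustration of the in-batch negatives this abstract describes, the sketch below computes an InfoNCE loss in which each head's gold tail is the positive and the other tails in the batch serve as negatives. Pre-batch and self-negatives are omitted, and the toy embeddings are invented; this is not SimKGC's implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_batch_infonce(heads, tails, temperature=0.05):
    """Mean InfoNCE loss: tail i is the positive for head i, and the other
    tails in the same batch act as negatives (in-batch negatives only)."""
    total = 0.0
    for i, h in enumerate(heads):
        logits = [dot(h, t) / temperature for t in tails]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        total += log_denominator - logits[i]  # -log softmax of the positive
    return total / len(heads)

# Toy unit-norm embeddings where head i is already aligned with tail i,
# so the loss should be near zero.
heads = [[1.0, 0.0], [0.0, 1.0]]
tails = [[1.0, 0.0], [0.0, 1.0]]
print(in_batch_infonce(heads, tails))
```

The batch itself supplies the negatives for free, which is what makes this form of contrastive learning cheap compared with sampling negatives separately per triple.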

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1150 - Wang 2023
Cross-Modal Knowledge Discovery, Inference, and Challenges

Wang, M.; Zhang, N.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13759 LNCS():199-209

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-31414-8_6 · Ref ID: 5271

In recent years, multimodal knowledge has become a popular research topic in many fields, such as knowledge graphs and natural language processing. Multimodal knowledge involves multimodal knowledge graphs, multimodal pre-trained language models, multimodal knowledge inference, etc.; from online shopping to medical care, whether in theoretical research or engineering application, the representation, discovery, and inference of multimodal knowledge have become core technologies of concern to both academia and industry. This tutorial focuses on the state of the art of cross-modal knowledge discovery and inference and presents future research opportunities and challenges. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1789 - Wang 2023
Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings

Wang, P.; Xie, X.; Wang, X.; Zhang, N.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14302 LNAI():111-122

Springer Science and Business Media Deutschland GmbH 2023

DOI: 10.1007/978-3-031-44693-1_9 · Ref ID: 5141

Previous knowledge graph embedding approaches usually map entities to representations and utilize score functions to predict the target entities, yet they typically struggle to reason about rare or emerging unseen entities. In this paper, we propose kNN-KGE, a new knowledge graph embedding approach with pre-trained language models, which linearly interpolates the entity distribution with that of the k-nearest neighbors. We compute the nearest neighbors based on the distance in the entity embedding space from the knowledge store. Our approach allows rare or emerging entities to be memorized explicitly rather than implicitly in model parameters. Experimental results demonstrate that our approach can improve inductive and transductive link prediction results and yield better performance in low-resource settings with only a few triples, which might be easier to reason over via explicit memory (Code is available at: https://github.com/zjunlp/KNN-KG ). © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
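A minimal sketch of the interpolation idea, assuming a toy knowledge store of (embedding, entity) pairs. The distance weighting and the λ value are illustrative choices, not the paper's exact formulation.

```python
import math
from collections import Counter

def knn_entity_distribution(query_emb, store, k=2):
    """Distribution over entities from the k nearest neighbours in an
    (embedding, entity) knowledge store, weighted by softmax-like weights
    on negative Euclidean distance."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    nearest = sorted(store, key=lambda item: dist(query_emb, item[0]))[:k]
    weights = [math.exp(-dist(query_emb, emb)) for emb, _ in nearest]
    z = sum(weights)
    probs = Counter()
    for (emb, entity), w in zip(nearest, weights):
        probs[entity] += w / z
    return probs

def interpolate(p_model, p_knn, lam=0.5):
    """Linear interpolation of the parametric and kNN distributions."""
    entities = set(p_model) | set(p_knn)
    return {e: lam * p_knn.get(e, 0.0) + (1 - lam) * p_model.get(e, 0.0)
            for e in entities}

# Two stored embeddings for "Paris" sit near the query; "Lyon" is far away.
store = [([0.0, 0.0], "Paris"), ([0.1, 0.0], "Paris"), ([5.0, 5.0], "Lyon")]
p_knn = knn_entity_distribution([0.05, 0.0], store, k=2)
p_model = {"Paris": 0.4, "Lyon": 0.6}  # the parametric model is unsure
final = interpolate(p_model, p_knn, lam=0.5)
print(max(final, key=final.get))  # explicit memory pushes the answer to Paris
```

The explicit store lets an entity with only a few stored embeddings dominate the prediction even when the parametric model has barely learned it, which is the low-resource benefit the abstract claims.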

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#1636 - Wang 2023
A Medical Question Classification Approach Based on Prompt Tuning and Contrastive Learning

Wang, Q.; Zeng, C.; Liu, Y.; He, P.

Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE 2023;2023-July():632-635

Knowledge Systems Institute Graduate School 2023

DOI: 10.18293/SEKE2023-025 · Ref ID: 5292

COVID-19 has profoundly impacted people's lives, and people have become more concerned about medical and health issues, so it is essential to design an efficient method for classifying medical questions. Fine-tuning paradigms based on pre-trained language models have proven effective in recent years. However, PLMs based on fine-tuning paradigms are not very robust, and there is a gap between the pre-training phase and the form of downstream tasks, so PLMs cannot exploit the rich latent knowledge in downstream tasks. We propose a medical question classification method that combines prompt fine-tuning and contrastive learning, and uses the large-scale knowledge-graph-enhanced model ERNIE 3.0 as a feature extractor to address both problems. Our approach utilizes an additional prompt template to enable the PLM to unleash its potential on specific tasks, and uses a contrastive-sample strategy to alleviate the problem of confusable samples that are difficult to distinguish. Experiments on a medical question classification dataset show that the method achieves an accuracy of 93.65 percent, with better metrics than recent work. © 2023 Knowledge Systems Institute Graduate School. All rights reserved.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1389 - Wang 2024
IDEATE: Detecting AI-Generated Text using Internal and External Factual Structures

Wang, Q.; Zhang, L.; Guo, Z.; Mao, Z.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():8556-8568

European Language Resources Association (ELRA) 2024

Ref ID: 4590

The effective detection of AI-generated text is a vital principle for ensuring responsible use of large language models (LLMs). Previous studies mainly focused on discovering and utilizing internal evidence contained in the text itself to perform the detection, while ignoring external evidence embedded in an established knowledge graph (KG), which may also provide key discriminative factors between AI-generated and human-written text. To address this deficiency, we propose IDEATE, a novel hierarchical graph network that utilizes both internal and external factual structures to detect AI-generated text. IDEATE consists of a mention-level subgraph at the bottom that describes internal factual structures of mentioned entities reflected in the input text, and an entity-level subgraph at the top that describes external factual structures of mentioned entities reflected in an external KG. Hierarchical graph convolution is then applied successively to the two subgraphs, through which the two types of factual structures are embedded into the output and used for the final detection. Extensive experiments on four benchmark datasets show that IDEATE consistently outperforms current state-of-the-art methods in detecting text generated by various LLMs, ranging from GPT-2 to the more powerful ChatGPT, verifying the necessity and superiority of introducing external evidence for AI-generated text detection. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#1634 - Wang 2024
Medical Knowledge Graph Question-Answering System Based on Hybrid Dynamic Masking and Multi-strategy Fusion

Wang, R.; Zhang, X.

J. Frontier. Comput. Sci. Technol. 2024;18(10):2770-2786

2024

DOI: 10.3778/j.issn.1673-9418.2401072 · Ref ID: 3854

Medical knowledge graph question-answering combines medical knowledge and natural language processing technology to provide accurate and fast question-answering services for medical practitioners and patients. However, the current Chinese medical knowledge graphs are not comprehensive enough due to the surge in data. Additionally, the complex and ambiguous nature of medical questions poses a significant challenge in accurately identifying entity information and generating answers that are both easily comprehensible and accessible to the public. This paper proposes a medical knowledge graph question-answering framework based on hybrid dynamic masking and multi-strategy fusion. Initially, a medical knowledge graph encompassing 34167 entities and 297463 relationships is constructed by integrating public datasets and disease knowledge from medical platforms, covering categories such as diseases, medications, and food. Subsequently, a BERT-MaskAttention-BiLSTM-CRF hybrid dynamic masking model is introduced to accurately identify medical entity information in the input, effectively focusing on essential content and eliminating interference from redundant information. Finally, entity alignment strategies are employed to unify and standardize medical entities, while intent recognition strategies delve into users' query intentions. This is coupled with the use of large language models to refine the output from the knowledge graph, ensuring that the responses are more readily comprehensible. Experimental results demonstrate that the model achieves a macro-average F1 score of 0.9602 in entity recognition comparative experiments and an average accuracy of 0.9656 in question-answering tests. The generated content is more easily comprehensible and interpretable. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
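The entity-recognition, intent-recognition, and graph-lookup flow this abstract outlines can be illustrated with a deliberately tiny pipeline. The lexicon, intents, and graph below are invented stand-ins for the paper's BERT-based models and its Neo4j-scale knowledge graph.

```python
# Toy KGQA pipeline: recognise a medical entity, classify the intent,
# then look the answer up in a tiny (entity, relation) -> values graph.
GRAPH = {("flu", "treated_by"): ["oseltamivir"],
         ("flu", "avoid_food"): ["alcohol"]}

def recognise_entity(question, lexicon=("flu",)):
    """Stand-in for the entity-recognition model: lexicon matching."""
    return next((e for e in lexicon if e in question), None)

def recognise_intent(question):
    """Stand-in for the intent-recognition strategy: keyword rules."""
    if "treat" in question or "medicine" in question:
        return "treated_by"
    if "eat" in question or "food" in question:
        return "avoid_food"
    return None

def answer(question):
    entity, intent = recognise_entity(question), recognise_intent(question)
    if entity is None or intent is None:
        return None
    return GRAPH.get((entity, intent))

print(answer("What medicine treats the flu?"))  # -> ['oseltamivir']
```

In the paper the recognisers are learned models and an LLM rewrites the retrieved values into readable prose; the skeleton of entity plus intent keying into the graph is the same.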

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#692 - Wang 2023
Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs

Wang, S. Y.; Wei, Z. Y.; Han, M.; Fan, Z. H.; Shan, H. J.; Zhang, Q.; Huang, X. J.

61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():4706-4718

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3194

Logical reasoning over incomplete knowledge graphs to answer complex logical queries is a challenging task. With the emergence of new entities and relations in constantly evolving KGs, inductive logical reasoning over KGs has become a crucial problem. However, previous PLM-based methods struggle to model the logical structures of complex queries, which limits their ability to generalize within the same structure. In this paper, we propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs. It encodes linearized query structures and entities using pre-trained language models to find answers. For structure modeling of complex queries, we design stepwise instructions that implicitly prompt PLMs on the execution order of geometric operations in each query. We further separately model different geometric operations (i.e., projection, intersection, and union) on the representation space using a pre-trained encoder with additional attention and maxout layers to enhance structured modeling. We conduct experiments on two inductive logical reasoning datasets and three transductive datasets. The results demonstrate the effectiveness of our method on logical reasoning over KGs in both inductive and transductive settings.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3709 - Wang 2024
MGSA: Multi-Granularity Graph Structure Attention for Knowledge Graph-to-Text Generation

Wang, Shanshan; Zhang, Chun; Zhang, Ning

arXiv 2024;():

2024

Ref ID: 8602

The Knowledge Graph-to-Text Generation task aims to convert structured knowledge graphs into coherent and human-readable natural language text. Recent efforts in this field have focused on enhancing pre-trained language models (PLMs) by incorporating graph structure information to capture the intricate structural details of knowledge graphs. However, most of these approaches tend to capture only single-granularity structure information, concentrating either on the relationships between entities within the original graph or on the relationships between words within the same entity or across different entities. This narrow focus results in a significant limitation: models that concentrate solely on entity-level structure fail to capture the nuanced semantic relationships between words, while those that focus only on word-level structure overlook the broader relationships between entire original entities. To overcome these limitations, this paper introduces the Multi-granularity Graph Structure Attention (MGSA), which is based on PLMs. The encoder of the model architecture features an entity-level structure encoding module, a word-level structure encoding module, and an aggregation module that synthesizes information from both structures. This multi-granularity structure encoding approach allows the model to simultaneously capture both entity-level and word-level structure information, providing a more comprehensive understanding of the knowledge graph's structure, thereby significantly improving the quality of the generated text. We conducted extensive evaluations of the MGSA model using two widely recognized KG-to-Text Generation benchmark datasets, WebNLG and EventNarrative, where it consistently outperformed models that rely solely on single-granularity structure information, demonstrating the effectiveness of our approach.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#784 - Wang 2023
SSKGE: a time-saving knowledge graph embedding framework based on structure enhancement and semantic guidance

Wang, T.; Shen, B.; Zhong, Y.

Appl. Intell. 2023;53(21):25171-25183

2023

DOI: 10.1007/s10489-023-04896-8 · Ref ID: 2969

In knowledge graph embedding, an attempt is made to embed the objective facts and relationships expressed in the form of triplets into multidimensional vector space, facilitating various applications, such as link prediction and question answering. Structure embedding models focus on the graph structure while the importance of language semantics in inferring similar entities and relations is ignored. Semantic embedding models use pretrained language models to learn entity and relation embeddings based on text information, but they do not fully exploit graph structures that reflect relation patterns and mapping attributes. Structure and semantic information in knowledge graphs represent different hierarchical properties that are indispensable for comprehensive knowledge representation. In this paper, we propose a general knowledge graph embedding framework named SSKGE, which considers both the graph structure and language semantics and learns these two complementary characteristics to integrate entity and relation representations. To compensate for semantic embedding approaches that ignore the graph structure, we first design a structure loss function to explicitly model the graph structure attributes. Second, we leverage a pretrained language model that has been fine-tuned by the structure loss to guide the structure embedding approaches in enhancing the semantic information they lack and obtaining universal knowledge representations. Specifically, guidance is provided by a distance function that makes the spatial distribution of the two types of graph embeddings have a certain similarity. SSKGE significantly reduces the time cost of using a pretrained language model to complete a knowledge graph. Common knowledge graph embedding models such as TransE, DistMult, ComplEx, RotatE, PairRE, and HousE have achieved better results with multiple datasets, including FB15k, FB15k-237, WN18, and WN18RR, using the SSKGE framework. Extensive experiments and analyses have verified the effectiveness and practicality of SSKGE.
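The distance-based guidance this abstract mentions, making the spatial distributions of the semantic and structure embeddings similar, can be illustrated with a simple mean-squared alignment penalty. This is an assumption-laden sketch, not SSKGE's exact objective.

```python
def mse_alignment_loss(semantic_embs, structure_embs):
    """Penalise the gap between semantic (PLM-derived) and structure
    embeddings of the same entities, pushing the two spaces toward a
    similar spatial distribution (illustrative distance function)."""
    total, n = 0.0, 0
    for s, g in zip(semantic_embs, structure_embs):
        total += sum((a - b) ** 2 for a, b in zip(s, g))
        n += len(s)
    return total / n

# One entity, 2-dimensional embeddings in each space.
print(mse_alignment_loss([[1.0, 0.0]], [[0.5, 0.0]]))  # -> 0.125
```

Adding such a term to the structure model's training loss is one way the fine-tuned language model could "guide" the structure embeddings without being run at inference time, which would account for the time savings the abstract claims.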

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3358 - Wang 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models

Wang, Wenxuan; Shi, Juluan; Tu, Zhaopeng; Yuan, Youliang; Huang, Jen-tse; Jiao, Wenxiang; Lyu, Michael R.

arXiv 2024;():

2024

Ref ID: 8021

Large Language Models (LLMs) like ChatGPT are foundational in various applications due to their extensive knowledge from pre-training and fine-tuning. Despite this, they are prone to generating factual and commonsense errors, raising concerns that they may mislead users in critical areas like healthcare, journalism, and education. Current methods for evaluating LLMs' veracity are limited by test data leakage or the need for extensive human labor, hindering efficient and accurate error detection. To tackle this problem, we introduce a novel, automatic testing framework, FactChecker, aimed at uncovering factual inaccuracies in LLMs. This framework involves three main steps: First, it constructs a factual knowledge graph by retrieving fact triplets from a large-scale knowledge database. Then, leveraging the knowledge graph, FactChecker employs a rule-based approach to generate three types of questions (Yes-No, Multiple-Choice, and WH questions) that involve single-hop and multi-hop relations, along with correct answers. Lastly, it assesses the LLMs' responses for accuracy using tailored matching strategies for each question type. Our extensive tests on six prominent LLMs, including text-davinci-002, text-davinci-003, ChatGPT (gpt-3.5-turbo, gpt-4), Vicuna, and LLaMA-2, reveal that FactChecker can trigger factual errors in up to 45% of questions in these models. Moreover, we demonstrate that FactChecker's test cases can improve LLMs' factual accuracy through in-context learning and fine-tuning (e.g., llama-2-13b-chat's accuracy increases from 35.3% to 68.5%). We are making all code, data, and results available for future research endeavors.
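The rule-based question-generation step can be sketched with simple templates. The template wording and the example triple below are illustrative, not FactChecker's actual rules.

```python
def make_questions(head, relation, tail, distractors):
    """Turn one fact triple into the three question types the framework
    uses: Yes-No, Multiple-Choice, and WH, each paired with its answer."""
    yes_no = (f"Is it true that {head} {relation} {tail}? (yes/no)", "yes")
    options = sorted([tail] + distractors)
    multiple_choice = (
        f"{head} {relation} which of the following? "
        f"Options: {', '.join(options)}",
        tail,
    )
    wh = (f"{head} {relation} what?", tail)
    return [yes_no, multiple_choice, wh]

for question, answer in make_questions(
        "Paris", "is the capital of", "France",
        distractors=["Spain", "Italy"]):
    print(question, "->", answer)
```

Because each question is derived from a known triple, grading an LLM's response reduces to matching it against the stored answer with a per-type strategy (yes/no normalisation, option matching, string matching for WH).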

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1556 - Wang 2022
Language Models as Knowledge Embeddings

Wang, X.; He, Q.; Liang, J.; Xiao, Y.

IJCAI International Joint Conference on Artificial Intelligence 2022;():2291-2297

International Joint Conferences on Artificial Intelligence 2022

Ref ID: 5443

Knowledge embeddings (KE) represent a knowledge graph (KG) by embedding entities and relations into continuous vector spaces. Existing methods are mainly structure-based or description-based. Structure-based methods learn representations that preserve the inherent structure of KGs; they cannot adequately represent the abundant long-tail entities in real-world KGs, which have limited structural information. Description-based methods leverage textual information and language models. Prior approaches in this direction barely outperform structure-based ones, and suffer from problems such as expensive negative sampling and restrictive description demands. In this paper, we propose LMKE, which adopts Language Models to derive Knowledge Embeddings, aiming both at enriching representations of long-tail entities and at solving the problems of prior description-based methods. We formulate description-based KE learning within a contrastive learning framework to improve efficiency in training and evaluation. Experimental results show that LMKE achieves state-of-the-art performance on KE benchmarks of link prediction and triple classification, especially for long-tail entities. © 2022 International Joint Conferences on Artificial Intelligence. All rights reserved.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1919 - Wang 2023
A Survey of Pre-Trained Language Models Incorporating Knowledge Graphs

Wang, X.; Liu, J.; Zhou, J.; Wang, J.

2023 IEEE International Conference on Electrical, Automation and Computer Engineering, ICEACE 2023 2023;():1706-1710

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICEACE60673.2023.10442824 · Ref ID: 4986

Pre-trained models acquire knowledge from vast amounts of unannotated and unstructured data through self-supervised learning. However, they suffer from limitations such as inadequate performance and limited knowledge reasoning capabilities due to the lack of external knowledge guidance. To address these limitations, integrating structured knowledge from knowledge graphs into pre-trained models enables them to acquire both general semantic knowledge from free text and the real-world knowledge behind the text, thereby effectively addressing downstream knowledge-driven tasks. This paper introduces the concepts of pre-trained models and knowledge graphs, discusses research advancements, provides an overview of methods for integrating knowledge into pre-trained models, and proposes three classification approaches based on fusion methods. It also outlines the application domains where these approaches can be applied. Finally, the paper summarizes and discusses future research directions for pre-trained models integrated with knowledge. © 2023 IEEE.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#1533 - Wang 2023
Knowledge-enhanced Pre-Training large language model for depression diagnosis and treatment

Wang, X.; Liu, K.; Wang, C.

Proceeding of 2023 9th IEEE International Conference on Cloud Computing and Intelligence Systems, CCIS 2023 2023;():532-536

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/CCIS59572.2023.10263217 · Ref ID: 5188

Depression, a pervasive psychiatric disorder characterized by concealment, dependence on expert judgment, and a notable rate of misdiagnosis, poses a substantial burden on society. To enhance the diagnosis and treatment of depression, this study proposes employing knowledge-enhanced pre-training technology that leverages large language models. By integrating domain knowledge and depression knowledge graph directives, the pre-trained model is optimized. Expert involvement in depression diagnosis and treatment fosters a guided learning process facilitated by expert feedback. Through the application of dialogue therapy, the efficacy of treatment is augmented. This technical approach aims to reduce the societal burden by improving the diagnosis and treatment of depressed individuals. © 2023 IEEE.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#3148 - Wang 2024
Medical knowledge graph completion via fusion of entity description and type information

Wang, Xiaochen; Zhang, Runtong; Zhao, Butian; Yao, Yuhan; Zhao, Hongmei; Zhu, Xiaomin

Artif. Intell. Med. 2024;151(C):11

2024

DOI: 10.1016/j.artmed.2024.102848 · Ref ID: 7137

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#3175 - Wang 2024
AceMap: Knowledge Discovery through Academic Graph

Wang, Xinbing; Fu, Luoyi; Gan, Xiaoying; Wen, Ying; Zheng, Guanjie; Ding, Jiaxin; Xiang, Liyao; Ye, Nanyang; Jin, Meng; Liang, Shiyu; Lu, Bin; Wang, Haiwen; Xu, Yi; Deng, Cheng; Zhang, Shao; Kang, Huquan; Wang, Xingli; Li, Qi; Guo, Zhixin; Qi, Jiexing; Liu, Pan; Ren, Yuyang; Wu, Lyuwen; Yang, Jungang; Zhou, Jianping; Zhou, Chenghu

arXiv 2024;():

2024

Ref ID: 8160

The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publications. The representation of heterogeneous graphs and the effective measurement, analysis, and mining of such graphs pose significant challenges. To address these challenges, we present AceMap, an academic system designed for knowledge discovery through an academic graph. We present advanced database construction techniques to build the comprehensive AceMap database with large-scale academic entities that contain rich visual, textual, and numerical information. AceMap also employs innovative visualization, quantification, and analysis methods to explore associations and logical relationships among academic entities. AceMap introduces large-scale academic network visualization techniques centered on nebular graphs, providing a comprehensive view of academic networks from multiple perspectives. In addition, AceMap proposes a unified metric based on structural entropy to quantitatively measure the knowledge content of different academic entities. Moreover, AceMap provides advanced analysis capabilities, including tracing the evolution of academic ideas through citation relationships and concept co-occurrence, and generating concise summaries informed by this evolutionary process. In addition, AceMap uses machine reading methods to generate potential new ideas at the intersection of different fields. Exploring the integration of large language models and knowledge graphs is a promising direction for future research in idea evolution. Please visit \url{https://www.acemap.info} for further exploration.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3858 - Wang 2024
SciDaSynth: Interactive Structured Knowledge Extraction and Synthesis from Scientific Literature with Large Language Model

Wang, Xingbo; Huey, Samantha L.; Sheng, Rui; Mehta, Saurabh; Wang, Fei

arXiv 2024;():

2024

Ref ID: 8243

Extraction and synthesis of structured knowledge from extensive scientific literature are crucial for advancing and disseminating scientific progress. Although many existing systems facilitate literature review and digest, they struggle to process multimodal, varied, and inconsistent information within and across the literature into structured data. We introduce SciDaSynth, a novel interactive system powered by large language models (LLMs) that enables researchers to efficiently build structured knowledge bases from scientific literature at scale. The system automatically creates data tables to organize and summarize users' interested knowledge in literature via question-answering. Furthermore, it provides multi-level and multi-faceted exploration of the generated data tables, facilitating iterative validation, correction, and refinement. Our within-subjects study with researchers demonstrates the effectiveness and efficiency of SciDaSynth in constructing quality scientific knowledge bases. We further discuss the design implications for human-AI interaction tools for data extraction and structuring.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3640 - Wang 2024
LCMDC: Large-scale Chinese Medical Dialogue Corpora for Automatic Triage and Medical Consultation

Wang, Xinyuan; Li, Haozhou; Zheng, Dingfang; Peng, Qinke

arXiv 2024;():

2024

Ref ID: 8660

The global COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services, especially in medical triage and consultation. However, existing studies face two main challenges. First, the scarcity of large-scale, publicly available, domain-specific medical datasets due to privacy concerns, with current datasets being small and limited to a few diseases, limiting the effectiveness of triage methods based on Pre-trained Language Models (PLMs). Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations. To overcome these obstacles, we construct the Large-scale Chinese Medical Dialogue Corpora (LCMDC), comprising a Coarse-grained Triage dataset with 439,630 samples, a Fine-grained Diagnosis dataset with 199,600 samples, and a Medical Consultation dataset with 472,418 items, thereby addressing the data shortage in this field. Moreover, we further propose a novel triage system that combines BERT-based supervised learning with prompt learning, as well as a GPT-based medical consultation model using reinforcement learning. To enhance domain knowledge acquisition, we pre-trained PLMs using our self-constructed background corpus. Experimental results on the LCMDC demonstrate the efficacy of our proposed systems.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#569 - Wang 2022
Military Chain: Construction of Domain Knowledge Graph of Kill Chain Based on Natural Language Model

Wang, Y. F.; Wang, T.; Wang, J. H.; Zhou, X.; Gao, M.; Liu, R. M.

Mob. Inf. Syst. 2022;2022():11

2022

DOI: 10.1155/2022/7097385 · Ref ID: 3528

With the advent of the Big Data era, specialized data in the kill chain domain has increased dramatically, and engine-based information retrieval can hardly meet users' need for more accurate answers. The kill chain domain includes four components: control equipment, sensor equipment, strike equipment (weapon and platform), and evaluator equipment, along with related data containing a large amount of valuable information, such as the parameter information of each component. If these fragmented and confusing data are integrated and effective query methods are established, they can help professionals complete the military kill chain knowledge system. The knowledge system constructed in this paper is based on the Neo4j graph database and the US Command simulation system to establish a target-oriented knowledge map of the kill chain, aiming to provide data support for a Q&A system. Second, to facilitate querying, this paper establishes entity and relationship/attribute mining for Chinese kill chain question answering based on a continuous bag-of-words (CBOW) encoding model, a bidirectional long short-term memory-conditional random field (BiLSTM-CRF) named-entity model, and a bidirectional gated recurrent neural network (BiGRU) intent recognition model; returns the corresponding entity or attribute values in knowledge graph triple form; and finally constructs the answer. The constructed knowledge map of the kill chain contains 2767 items (covering sea, land, and air), with 30124 parameters involved. The deep learning network of the Q&A system has 27.9 M model parameters, and the accuracy rate is 85.5% over 200 simulated queries.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1464 - Wang 2024
KC-GenRe: A Knowledge-constrained Generative Re-ranking Method Based on Large Language Models for Knowledge Graph Completion

Wang, Y.; Hu, M.; Huang, Z.; Li, D.; Yang, D.; Lu, X.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9668-9680

European Language Resources Association (ELRA) 2024

Ref ID: 4540

The goal of knowledge graph completion (KGC) is to predict missing facts among entities. Previous methods for KGC re-ranking are mostly built on non-generative language models to obtain the probability of each candidate. Recently, generative large language models (LLMs) have shown outstanding performance on several tasks such as information extraction and dialog systems. Leveraging them for KGC re-ranking allows the task to benefit from their extensive pre-trained knowledge and powerful generative capabilities. However, they may encounter new problems when accomplishing the task, namely mismatch, misordering and omission. To this end, we introduce KC-GenRe, a knowledge-constrained generative re-ranking method based on LLMs for KGC. To overcome the mismatch issue, we formulate the KGC re-ranking task as a candidate identifier sorting generation problem implemented by generative LLMs. To tackle the misordering issue, we develop a knowledge-guided interactive training method that enhances the identification and ranking of candidates. To address the omission issue, we design a knowledge-augmented constrained inference method that enables contextual prompting and controlled generation, so as to obtain valid rankings. Experimental results show that KC-GenRe achieves state-of-the-art performance on four datasets, with gains of up to 6.7% and 7.7% in the MRR and Hits@1 metrics compared to previous methods, and 9.0% and 11.1% compared to that without re-ranking. Extensive analysis demonstrates the effectiveness of the components in KC-GenRe. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
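The candidate-identifier formulation and the constrained inference guarding against mismatch and omission can be sketched as follows. The prompt wording and the post-processing rules are illustrative assumptions, not the paper's exact template:

```python
def build_prompt(query, candidates):
    """Number the candidates and ask for an ordering of identifiers."""
    lines = [f"Query: {query}", "Rank the candidates from best to worst:"]
    lines += [f"[{i}] {c}" for i, c in enumerate(candidates)]
    return "\n".join(lines)

def constrained_rerank(candidates, llm_output):
    """Constrain the generated ordering to the valid identifier set:
    drop out-of-range or repeated identifiers (mismatch), and append
    any identifiers the model omitted, in their original order (omission)."""
    order, seen = [], set()
    for tok in llm_output.replace("[", " ").replace("]", " ").split():
        if tok.isdigit() and int(tok) < len(candidates) and int(tok) not in seen:
            seen.add(int(tok))
            order.append(int(tok))
    order += [i for i in range(len(candidates)) if i not in seen]
    return [candidates[i] for i in order]

cands = ["Paris", "Lyon", "Berlin"]
# "[7]" is out of range and "[2]" repeats; "[1]" is omitted by the model.
print(constrained_rerank(cands, "[2] [0] [7] [2]"))  # ['Berlin', 'Paris', 'Lyon']
```

In the actual system the `llm_output` string would come from the fine-tuned LLM; here it is a fixed example.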

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#193 - Wang 2023
Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering

Wang, Y. J.; Zhang, H.; Liang, J. Y.; Li, R.

61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():14048-14063

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3248

Recently, knowledge graphs (KGs) have achieved noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing the heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning by LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning, while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using the LM, and dynamically removes irrelevant KG entities based on the attention weights of the LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and performs answer reasoning on the HKG by RMSA. We evaluate DHLK on CommonsenseQA and OpenBookQA, and show its improvement over existing LM and LM+KG methods.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1964 - Wang 2023
Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction

Wang, Y.; Lu, D.; Kong, C.; Sang, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():4420-4432

Association for Computational Linguistics (ACL) 2023

Ref ID: 5178

Many works have employed prompt tuning methods to automatically optimize prompt queries and extract the factual knowledge stored in pre-trained language models. In this paper, we observe that the optimized prompts, including discrete prompts and continuous prompts, exhibit an undesirable object bias. To handle this problem, we propose a novel prompt tuning method called MeCoD, consisting of three modules: Prompt Encoder, Object Equalization and Biased Object Obstruction. Experimental results show that MeCoD can significantly reduce the object bias and at the same time improve the accuracy of factual knowledge extraction. © 2023 Association for Computational Linguistics.
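One minimal way to make the object bias described here concrete is to collect a model's predicted objects over many prompts for a single relation and measure the entropy of the prediction distribution: a biased prompt collapses onto a few frequent objects and yields low entropy. The prediction lists below are hypothetical stand-ins for real model outputs:

```python
import math
from collections import Counter

def prediction_entropy(predictions):
    """Shannon entropy (bits) of the predicted-object distribution."""
    counts = Counter(predictions)
    total = len(predictions)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

biased   = ["USA", "USA", "USA", "USA"]         # prompt always predicts USA
balanced = ["USA", "France", "Japan", "Chile"]  # predictions track the subject

print(prediction_entropy(biased))    # 0.0
print(prediction_entropy(balanced))  # 2.0
```

A debiasing method like the one proposed should push the entropy of the prediction distribution up toward that of the true object distribution.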

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#676 - Wang 2023
Prompt-based Zero-shot Text Classification with Conceptual Knowledge

Wang, Y. Q.; Wang, W.; Chen, Q.; Huang, K. Z.; Nguyen, A.; De, S.

61st Annual Meeting of the Association-for-Computational-Linguistics / Student Research Workshop (ACL-SRW) 2023;():30-38

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3451

In recent years, pre-trained language models have garnered significant attention due to their effectiveness, which stems from the rich knowledge acquired during pre-training. To mitigate the inconsistency between pre-training tasks and downstream tasks and to facilitate the resolution of language-related issues, prompt-based approaches have been introduced, which are particularly useful in low-resource scenarios. However, existing approaches mostly rely on verbalizers to translate the predicted vocabulary to task-specific labels. The major limitations of this approach are that it ignores potentially relevant domain-specific words and is biased by the pre-training data. To address these limitations, we propose a framework that incorporates conceptual knowledge for text classification in the extreme zero-shot setting. The framework includes prompt-based keyword extraction, weight assignment to each prompt keyword, and final representation estimation in the knowledge graph embedding space. We evaluated the method on four widely-used datasets for sentiment analysis and topic detection, demonstrating that it consistently outperforms recently-developed prompt-based approaches in the same experimental settings.
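The framework's final step, estimating a representation from weighted keywords and classifying by proximity in an embedding space, can be sketched as below. The two-dimensional vectors, keyword weights, and label embeddings are hand-made illustrations, not trained KG embeddings:

```python
def weighted_mean(vectors, weights):
    """Weighted average of equal-length vectors."""
    dim, total = len(vectors[0]), sum(weights)
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total
            for d in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

keyword_vecs = [[0.9, 0.1], [0.8, 0.3]]   # e.g. embeddings of "goal", "match"
keyword_wts  = [0.7, 0.3]                 # weights assigned to each keyword
labels = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}

doc = weighted_mean(keyword_vecs, keyword_wts)
pred = max(labels, key=lambda name: cosine(doc, labels[name]))
print(pred)  # sports
```

No labeled training data is touched, which is what makes the setting "extreme zero-shot": only the keyword extractor, the weights, and the pre-built embedding space are used.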

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2046 - Wang 2024
Zero-Shot Medical Information Retrieval via Knowledge Graph Embedding

Wang, Y.; Wang, Z.; Wang, W.; Chen, Q.; Huang, K.; Nguyen, A.; De, S.

Communications in Computer and Information Science 2024;2019 CCIS():29-40

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-3-031-52216-1_3 · Ref ID: 4731

In the era of the Internet of Things (IoT), the retrieval of relevant medical information has become essential for efficient clinical decision-making. This paper introduces MedFusionRank, a novel approach to zero-shot medical information retrieval (MIR) that combines the strengths of pre-trained language models and statistical methods while addressing their limitations. The proposed approach leverages a pre-trained BERT-style model to extract compact yet informative keywords. These keywords are then enriched with domain knowledge by linking them to conceptual entities within a medical knowledge graph. Experimental evaluations on medical datasets demonstrate MedFusionRank's superior performance over existing methods, with promising results across a variety of evaluation metrics. MedFusionRank demonstrates efficacy in retrieving relevant information, even from short or single-term queries. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#1528 - Wang 2024
Knowledge-aware Reinforced Language Models for Protein Directed Evolution

Wang, Y.; Zhang, Q.; Qin, M.; Zhuang, X.; Li, X.; Gong, Z.; Wang, Z.; Zhao, Y.; Yao, J.; Ding, K.; Chen, H.

Proceedings of Machine Learning Research 2024;235():52260-52273

ML Research Press 2024

Ref ID: 4311

Directed evolution, a cornerstone of protein optimization, harnesses natural mutational processes to enhance protein functionality. Existing Machine Learning-assisted Directed Evolution (MLDE) methodologies typically rely on data-driven strategies and often overlook the profound domain knowledge in biochemical fields. In this paper, we introduce a novel Knowledge-aware Reinforced Language Model (KnowRLM) for MLDE. An Amino Acid Knowledge Graph (AAKG) is constructed to represent the intricate biochemical relationships among amino acids. We further propose a Protein Language Model (PLM)-based policy network that iteratively samples mutants through preferential random walks on the AAKG using a dynamic sliding window mechanism. The novel mutants are actively sampled to fine-tune a fitness predictor as the reward model, providing feedback to the knowledge-aware policy. Finally, we optimize the whole system in an active learning approach that mimics biological settings in practice. KnowRLM stands out for its ability to utilize contextual amino acid information from knowledge graphs, thus attaining advantages from both statistical patterns of protein sequences and biochemical properties of amino acids. Extensive experiments demonstrate the superior performance of KnowRLM in more efficiently identifying high-fitness mutants compared to existing methods. Copyright 2024 by the author(s)
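The preferential random walk over the amino-acid graph can be sketched simply: from the current residue, the next one is sampled in proportion to edge weight (a stand-in for biochemical similarity). The graph, its weights, and the single-letter nodes below are hypothetical example data, not the paper's AAKG:

```python
import random

# Hypothetical weighted amino-acid graph: edge weight ~ biochemical affinity.
AAKG = {
    "A": {"G": 0.8, "V": 0.6},
    "G": {"A": 0.8, "S": 0.5},
    "V": {"A": 0.6, "L": 0.9},
    "S": {"G": 0.5},
    "L": {"V": 0.9},
}

def preferential_walk(start, steps, seed=0):
    """Walk the graph, sampling each next node with weight-proportional odds."""
    rng = random.Random(seed)
    path, node = [start], start
    for _ in range(steps):
        neighbours = AAKG[node]
        node = rng.choices(list(neighbours), weights=list(neighbours.values()))[0]
        path.append(node)
    return path

print(preferential_walk("A", 4))
```

In KnowRLM the walk is additionally steered by the PLM-based policy and a sliding window; this sketch shows only the knowledge-biased sampling component.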

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3958 - Wang 2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons

Wang, Yifei; Chen, Yuheng; Wen, Wanting; Sheng, Yu; Li, Linjing; Zeng, Daniel Dajun

arXiv 2024;():

2024

Ref ID: 8513

In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs' internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt for alternative, shortcut-like pathways to answer reasoning questions. By manually manipulating the recall process of parametric knowledge in LLMs, we demonstrate that enhancing this recall process directly improves reasoning performance, whereas suppressing it leads to notable degradation. Furthermore, we assess the effect of Chain-of-Thought (CoT) prompting, a powerful technique for addressing complex reasoning tasks. Our findings indicate that CoT can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning. Finally, we explore how contextual conflicts affect the retrieval of facts during the reasoning process to gain a comprehensive understanding of the factual recall behaviors of LLMs. Code and data will be available soon.

Mike voted
yuexi voted
Final decision
What was the agreed final decision?

#3783 - Wang 2024
Prometheus Chatbot: Knowledge Graph Collaborative Large Language Model for Computer Components Recommendation

Wang, Yunsheng; Chen, Songhao; Jin, Kevin

arXiv 2024;():

2024

Ref ID: 8491

Knowledge graphs (KGs) are essential in applications such as network alignment, question-answering, and recommender systems (RSs) since they offer structured relational data that facilitate the inference of indirect relationships. However, the development of KG-based RSs capable of processing user inputs in natural language faces significant challenges. Firstly, natural language processing units must effectively handle the ambiguity and variability in human language to interpret user intents accurately. Secondly, the system must precisely identify and link entities, like product names, to their corresponding nodes in KGs. To overcome these challenges, supported by Lenovo, we developed a novel chatbot called "Prometheus," which integrates a KG with a large language model (LLM), specifically designed for recommending computer components. This chatbot can accurately decode user requests and deliver personalized recommendations derived from KGs, ensuring precise comprehension and response to their computer setup needs.
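The second challenge named above, linking free-text product mentions to KG nodes, can be sketched with fuzzy string matching. A production system would typically use embedding similarity; stdlib `difflib` is a minimal stand-in, and the node labels below are hypothetical:

```python
import difflib

# Hypothetical KG node labels for computer components.
KG_NODES = ["ThinkPad X1 Carbon", "Legion 5 Pro", "IdeaPad Slim 3"]

def link_entity(mention, nodes=KG_NODES, cutoff=0.5):
    """Map a free-text mention to the closest KG node label, if any."""
    lookup = {n.lower(): n for n in nodes}  # case-insensitive matching
    hits = difflib.get_close_matches(mention.lower(), list(lookup),
                                     n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

print(link_entity("thinkpad x1"))  # ThinkPad X1 Carbon
print(link_entity("zzzz"))         # None
```

Once the mention is resolved to a node, the recommendation logic can traverse the KG's compatibility edges; the LLM handles the ambiguity of the surrounding request.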

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#984 - Wang 2023
AMD Results for OAEI 2023

Wang, Z.

CEUR Workshop Proceedings 2023;3591():146-153

CEUR-WS 2023

Ref ID: 5044

AgreementMakerDeep (AMD) is a new flexible and extensible ontology matching system. It exploits the contextual and structural information of ontologies by infusing knowledge into a pre-trained masked language model, and then filters the output mappings using knowledge graph embedding techniques. AMD learns classes and the relations between them by constructing vector representations in a low-dimensional embedding space with knowledge graph embedding methods. The results demonstrate that AMD achieves competitive performance on many OAEI tracks, but it has limitations for property and instance matching. © 2023 Copyright for this paper by its authors.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#2293 - Wang 2023
CRule: Category-Aware Symbolic Multihop Reasoning on Knowledge Graphs

Wang, Z.; Li, L.; Li, J.; Zhao, P.; Zeng, D.

IEEE Intelligent Systems 2023;38(5):56-64

2023

DOI: 10.1109/MIS.2023.3291567 · Ref ID: 6269

Multihop reasoning is essential in knowledge graph (KG) research and applications. Current methods rely on specific KG entities, while human cognition operates at a more abstract level. This article proposes a category-aware rule-based (CRule) approach for symbolic multihop reasoning. Specifically, given a KG, CRule first categorizes entities and constructs a category-aware KG; it then uses rules retrieved from the categorized KG to perform multihop reasoning on the original KG. Experiments on five datasets show that CRule is simple and effective, and combines the advantages of symbolic and neural network methods. It overcomes symbolic reasoning’s complexity limitations, can perform reasoning on KGs with more than 300,000 edges, and can be three times more efficient than neural network models.
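The categorize-then-reason idea can be sketched in a few lines: entity-level triples are lifted to categories, a category-level rule is stated once, and that rule licenses new entity-level inferences on the original KG. The categories, triples, and the single rule below are hypothetical example data:

```python
CATEGORY = {"Paris": "City", "France": "Country", "Berlin": "City",
            "Germany": "Country", "Europe": "Continent"}

TRIPLES = {("Paris", "capital_of", "France"),
           ("France", "part_of", "Europe"),
           ("Berlin", "capital_of", "Germany"),
           ("Germany", "part_of", "Europe")}

# Category-level rule:
#   capital_of(City, Country) & part_of(Country, Continent)
#     => located_in(City, Continent)
def apply_rule(triples):
    """Apply the two-hop category rule to entity-level triples."""
    inferred = set()
    for a, r1, b in triples:
        if r1 != "capital_of" or CATEGORY[a] != "City":
            continue
        for b2, r2, c in triples:
            if b2 == b and r2 == "part_of" and CATEGORY[c] == "Continent":
                inferred.add((a, "located_in", c))
    return inferred

print(sorted(apply_rule(TRIPLES)))
# [('Berlin', 'located_in', 'Europe'), ('Paris', 'located_in', 'Europe')]
```

Because the rule is stated over categories rather than the 300,000+ individual entities, one rule covers many entity instantiations, which is where the claimed efficiency comes from.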

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1222 - Wang 2024
ECoK: Emotional Commonsense Knowledge Graph for Mining Emotional Gold

Wang, Z.; Liu, X.; Hu, M.; Ying, R.; Jiang, M.; Wu, J.; Xie, Y.; Gao, H.; Cheng, R.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():8055-8074

Association for Computational Linguistics (ACL) 2024

Ref ID: 4237

The demand for understanding and expressing emotions in the field of natural language processing is growing rapidly. Knowledge graphs, as an important form of knowledge representation, have been widely utilized in various emotion-related tasks. However, existing knowledge graphs mainly focus on the representation and reasoning of general factual knowledge, while there are still significant deficiencies in the understanding and reasoning of emotional knowledge. In this work, we construct a comprehensive and accurate emotional commonsense knowledge graph, ECoK. We integrate cutting-edge theories from multiple disciplines such as psychology, cognitive science, and linguistics, and combine techniques such as large language models and natural language processing. By mining a large amount of text, dialogue, and sentiment analysis data, we construct rich emotional knowledge and establish the knowledge generation model COMET-ECoK. Experimental results show that ECoK contains high-quality emotional reasoning triples, and the performance of our knowledge generation model surpasses GPT-4-Turbo, which can help downstream tasks better understand and reason about emotions. Our data and code is available from https://github.com/ZornWang/ECoK. © 2024 Association for Computational Linguistics.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#1839 - Wang 2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Wang, Z. M.; Peng, Z.; Que, H.; Liu, J.; Zhou, W.; Wu, Y.; Guo, H.; Gan, R.; Ni, Z.; Yang, J.; Zhang, M.; Zhang, Z.; Ouyang, W.; Xu, K.; Huang, S. W.; Fu, J.; Peng, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14743-14777

Association for Computational Linguistics (ACL) 2024

Ref ID: 4225

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4). © 2024 Association for Computational Linguistics.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1245 - Wang 2024
Enhance Large Language Models for Multilingual Sentence Embedding with Knowledge Graph

Wang, Z.; Wu, Y.

Proceedings of the International Joint Conference on Neural Networks 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/IJCNN60899.2024.10650221 · Ref ID: 4282

Sentence representation is a major challenge in natural language processing, especially in multilingual environments. Current approaches to sentence representation using Large Language Models (LLMs) often require large amounts of data for fine-tuning, and research has focused on English content. In addition, comparative datasets translated directly from English can contain many semantic and syntactic errors. To address these issues, we propose a new approach to enhance multilingual sentence embeddings using LLMs and knowledge graphs. We first present a specially designed prompt that exploits the in-context learning of LLMs for sentence embedding without fine-tuning. We further introduce an innovative method that utilizes knowledge graphs, such as Wikidata, for generating diverse multilingual training data for contrastive fine-tuning. This approach significantly reduces the reliance on translated sentences and mitigates issues related to translation accuracy. Furthermore, we develop a unique multilingual contrastive learning loss function, which, when combined with QLoRA's efficient fine-tuning technique, enables LLMs to achieve state-of-the-art performance in Sentence Text Similarity (STS) tasks, even with limited computational resources. © 2024 IEEE.
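The KG-based data generation can be sketched as verbalizing one triple with a per-language template, yielding semantically aligned sentences without machine translation. The templates, entity IDs, and labels below are illustrative (the Q-IDs follow Wikidata's convention, but the label tables are hand-made):

```python
TEMPLATES = {
    "en": "{subj} is the capital of {obj}.",
    "fr": "{subj} est la capitale de {obj}.",
    "de": "{subj} ist die Hauptstadt von {obj}.",
}
LABELS = {  # per-language entity labels, as a KG like Wikidata provides
    "Q90":  {"en": "Paris",  "fr": "Paris",  "de": "Paris"},
    "Q142": {"en": "France", "fr": "France", "de": "Frankreich"},
}

def verbalize(subj_id, obj_id, lang):
    """Render one (subject, capital_of, object) triple in the given language."""
    return TEMPLATES[lang].format(subj=LABELS[subj_id][lang],
                                  obj=LABELS[obj_id][lang])

# Positive pairs for contrastive fine-tuning: same triple, different languages.
pairs = [(verbalize("Q90", "Q142", "en"), verbalize("Q90", "Q142", lang))
         for lang in ("fr", "de")]
for en, other in pairs:
    print(en, "<->", other)
```

Because both sentences in a pair are generated from the same fact rather than translated from each other, semantic alignment is guaranteed by construction.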

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1382 - Wasi 2024
HRGraph: Leveraging LLMs for HR Data Knowledge Graphs with Information Propagation-based Job Recommendation

Wasi, A. T.

KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():56-62

Association for Computational Linguistics (ACL) 2024

Ref ID: 4372

Knowledge Graphs (KGs), serving as semantic networks, prove highly effective in managing complex interconnected data in different domains by offering a unified, contextualized, and structured representation with flexibility that allows for easy adaptation to evolving knowledge. In processing complex Human Resources (HR) data, KGs can help in different HR functions such as recruitment, job matching, identifying learning gaps, and enhancing employee retention. Despite their potential, limited efforts have been made to implement practical HR knowledge graphs. This study addresses this gap by presenting a framework for effectively developing HR knowledge graphs from documents using Large Language Models. The resulting KG can be used for a variety of downstream tasks, including job matching, identifying employee skill gaps, and many more. In this work, we showcase instances where HR KGs prove instrumental in precise job matching, yielding advantages for both employers and employees. Empirical evidence from experiments with information propagation in KGs and Graph Neural Nets, along with case studies, underscores the effectiveness of KGs in tasks such as job and employee recommendations and job area classification. ©2024 Association for Computational Linguistics.
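A minimal sketch of the information-propagation idea: skills propagate from an employee node through shared-skill edges to job nodes, and jobs are ranked by the fraction of their required skills the employee covers. The tiny HR graph and its names are hypothetical example data:

```python
# Hypothetical HR knowledge graph, flattened to skill sets per node.
EMPLOYEE_SKILLS = {"alice": {"python", "sql", "ml"}}
JOB_SKILLS = {
    "data_scientist": {"python", "ml", "statistics"},
    "dba":            {"sql", "tuning"},
    "designer":       {"figma"},
}

def recommend_jobs(employee):
    """Rank jobs by the share of required skills the employee already has."""
    skills = EMPLOYEE_SKILLS[employee]
    scored = {job: len(skills & required) / len(required)
              for job, required in JOB_SKILLS.items()}
    return sorted((job for job, score in scored.items() if score > 0),
                  key=lambda job: -scored[job])

print(recommend_jobs("alice"))  # ['data_scientist', 'dba']
```

The same overlap structure also exposes skill gaps: the required skills not covered by the employee (`statistics` for the top job here) are exactly the learning gaps the abstract mentions.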

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1481 - Wei 2023
KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

Wei, Y.; Huang, Q.; Zhang, Y.; Kwok, J. T.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():8667-8683

Association for Computational Linguistics (ACL) 2023

Ref ID: 5103

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC. They can be categorized into two main classes: triple-based and text-based approaches. Triple-based methods struggle with long-tail entities due to limited structural information and imbalanced entity distributions. Text-based methods alleviate this issue but require costly training for language models and specific finetuning for knowledge graphs, which limits their efficiency. To alleviate these limitations, in this paper, we propose KICGPT, a framework that integrates a large language model (LLM) and a triple-based KGC retriever. It alleviates the long-tail problem without incurring additional training overhead. KICGPT uses an in-context learning strategy called Knowledge Prompt, which encodes structural knowledge into demonstrations to guide the LLM. Empirical results on benchmark datasets demonstrate the effectiveness of KICGPT with smaller training overhead and no finetuning. © 2023 Association for Computational Linguistics.
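The Knowledge Prompt strategy, encoding structural triples into in-context demonstrations alongside the retriever's candidates, can be sketched as below. The prompt wording and example triples are illustrative assumptions, not the paper's exact template:

```python
def knowledge_prompt(query, support_triples, candidates):
    """Verbalize KG triples as demonstrations and hand the LLM the
    retriever's candidate list for re-ranking."""
    lines = ["Known facts:"]
    lines += [f"- {h} {r} {t}." for h, r, t in support_triples]
    lines.append(f"Question: {query}")
    lines.append("Candidates: " + ", ".join(candidates))
    lines.append("Answer with the best candidate first.")
    return "\n".join(lines)

prompt = knowledge_prompt(
    "Einstein born_in ?",
    [("Einstein", "citizen_of", "Germany"), ("Einstein", "worked_at", "ETH")],
    ["Ulm", "Bern", "Paris"],
)
print(prompt)
```

The key property is that no model weights change: the structural knowledge the triple-based retriever lacks for long-tail entities is supplied purely through the prompt context.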

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#349 - Wei 2023
Improving Bug Severity Prediction With Domain-Specific Representation Learning

Wei, Y.; Zhang, C. F.; Ren, T.

IEEE Access 2023;11():62829-62839

2023

DOI: 10.1109/access.2023.3279205 · Ref ID: 3718

Automating the process of bug severity assignment can accelerate bug triagers' efficiency in the life-cycle of software maintenance, improving the quality of software products. The mainstream approaches for bug severity prediction mainly use different neural networks due to their automated learning ability. However, two problems make existing approaches fail to predict severities for some bugs: 1) they cannot learn the internal knowledge of bug reports; 2) supervised training struggles to capture the global context of bug reports. To resolve these two problems, in this paper, we propose a bug severity prediction approach, namely KICL, which combines pre-trained language models and domain-specific pre-training strategies, i.e., Knowledge-Intensified pre-training and contrastive learning pre-training. Specifically, Knowledge-Intensified pre-training allows KICL to learn project-specific bug report tokens, deeply understanding the internal knowledge of bug reports. As for contrastive learning, it allows KICL to perform sequence-level learning, understanding bug reports from the perspective of the global context. After pre-training, we can fine-tune the pre-trained KICL for bug severity prediction. To evaluate the effectiveness of KICL, we choose six baseline approaches and compare their performance on a public dataset. The experimental results show that KICL outperforms all baseline approaches by up to 30.68% in terms of weighted average F1-score, achieving new results for bug severity prediction.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3347 - Wei 2024
Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models

Wei, Yifan; Yu, Xiaoyan; Weng, Yixuan; Ma, Huanhuan; Zhang, Yuanzhe; Zhao, Jun; Liu, Kang

arXiv 2024;():

2024

Ref ID: 8573

Large language models encapsulate knowledge and have demonstrated superior performance on various natural language processing tasks. Recent studies have localized this knowledge to specific model parameters, such as the MLP weights in intermediate layers. This study investigates the differences between entity and relational knowledge through knowledge editing. Our findings reveal that entity and relational knowledge cannot be directly transferred or mapped to each other. This result is unexpected, as logically, modifying the entity or the relation within the same knowledge triplet should yield equivalent outcomes. To further elucidate the differences between entity and relational knowledge, we employ causal analysis to investigate how relational knowledge is stored in pre-trained models. Contrary to prior research suggesting that knowledge is stored in MLP weights, our experiments demonstrate that relational knowledge is also significantly encoded in attention modules. This insight highlights the multifaceted nature of knowledge storage in language models, underscoring the complexity of manipulating specific types of knowledge within these models.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#2097 - Weihong 2004
Adding context-awareness to knowledge management in modern enterprises

Weihong, Huang; Ting, Tao

2004 2nd International IEEE Conference on 'Intelligent Systems'. Proceedings (IEEE Cat. No.04EX791) 2004;2():393-398 Vol.2

2004

DOI: 10.1109/IS.2004.1344779 · Ref ID: 6072

To reduce the negative impact of knowledge loss and to improve the effectiveness of knowledge reuse in knowledge management in modern enterprises, this work presents a context-aware approach to facilitate managing various types of static enterprise information and dynamic process information. The proposed approach features representing and integrating information at different conceptual levels to present contextual knowledge in an open environment. In this paper, we redefine the concept of context in intelligent systems and propose a set of meta-information elements for context description in business environments. In realising context-awareness in knowledge management, we present a context knowledge structure model and examine the corresponding context knowledge storage and reuse solutions. To enhance context-aware knowledge management for e-businesses over the global network, we introduce a new concept of a context knowledge grid with a layered knowledge interoperation reference model, which is intended to leverage the contextual knowledge in modern enterprises and enable interoperation with other knowledge frameworks such as the semantic Web and the semantic grid.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1254 - Wen 2024
Enhancing Fault Troubleshooting through Human-Machine Collaboration: A Multi-Stage Reasoning Approach

Wen, S.; Chen, Y.; Pan, X.; Zhuang, W.; Li, X.

IEEE International Conference on Automation Science and Engineering 2024;():460-467

IEEE Computer Society 2024

DOI: 10.1109/CASE59546.2024.10711734 · Ref ID: 4112

Ensuring the stable operation of critical industrial equipment is pivotal for maintaining production efficiency and economic gains. The complexity of modern industrial machinery, however, places a substantial cognitive load on maintenance personnel. To alleviate this, a Diagnostic Semantic-Enhanced Fault Causality Knowledge Graph (DSFCKG) is proposed to formalize fault information for computational analysis. Additionally, a Large Language Model (LLM)-based Knowledge Graph Construction (KGC) method is introduced for the automated assembly of DSFCKG. Building upon this, a multi-stage reasoning approach is designed for human-machine collaborative fault troubleshooting. Experiments on real-world fault tickets demonstrate that our proposed method significantly enhances fault diagnosis and troubleshooting accuracy, especially in complex scenarios with long fault causal chains, bringing insights into future smart maintenance. © 2024 IEEE.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1670 - Wen 2021
Named Entity Recognition for Instructions of Chinese Medicine Based on Pre-trained Language Model

Wen, S.; Zeng, B.; Liao, W.

Proceedings - 2021 3rd International Conference on Natural Language Processing, ICNLP 2021 2021;():139-144

Institute of Electrical and Electronics Engineers Inc. 2021

DOI: 10.1109/ICNLP52887.2021.00029 · Ref ID: 5552

Named Entity Recognition (NER) of Chinese medicine text is a basic task in constructing medical and health knowledge graphs. Many scholars have researched the NER task for electronic medical records and drug names, while many factors restrict research on NER for the instructions of Chinese medicine. For example, there is no obvious boundary between words in Chinese, and it is difficult to capture the interactive information between sentences and the global information at the same time. Considering that this type of data is highly specialized and there is no publicly available data set, this paper collected 1,000 instructions of Chinese medicine and explored the effectiveness of pre-trained models on the NER task in this field. The experimental results showed that, compared with the results of single or joint models on the same data set, the F1 value of the pre-trained model increased by 9.65% and 8.71%, respectively. © 2021 IEEE.
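F1 values like those quoted here are conventionally computed at the entity level, over exact-match spans. A minimal sketch, assuming gold and predicted entities are given as (start, end, type) spans; the example spans and types are hypothetical:

```python
def entity_f1(gold, pred):
    """Exact-match entity-level F1 over (start, end, type) spans."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = [(0, 4, "DRUG"), (10, 15, "SYMPTOM")]
pred = [(0, 4, "DRUG"), (20, 25, "DOSE")]
print(entity_f1(gold, pred))  # 0.5
```

Here one of two gold entities is recovered and one of two predictions is correct, so precision and recall are both 0.5, giving F1 = 0.5.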

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#2969 - Weng 2012
Symbolic Models and Emergent Models: A Review

Weng, J.

IEEE Transactions on Autonomous Mental Development 2012;4(1):29-53

2012

DOI: 10.1109/TAMD.2011.2159113 · Ref ID: 6096

There exists a large conceptual gap between symbolic models and emergent models for the mind. Many emergent models work on low-level sensory data, while many symbolic models deal with high-level abstract (i.e., action) symbols. There has been relatively little study on intermediate representations, mainly because of a lack of knowledge about how representations fully autonomously emerge inside the closed brain skull, using information from the exposed two ends (the sensory end and the motor end). As reviewed here, this situation is changing. A fundamental challenge for emergent models is abstraction, which symbolic models enjoy through human handcrafting. The term abstract refers to properties disassociated with any particular form. Emergent abstraction seems possible, although the brain appears to never receive a computer symbol (e.g., ASCII code) or produce such a symbol. This paper reviews major agent models with an emphasis on representation. It suggests two different ways to relate symbolic representations with emergent representations: One is based on their categorical definitions. The other considers that a symbolic representation corresponds to a brain's outside behaviors observed and handcrafted by other outside human observers; but an emergent representation is inside the brain.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1588 - White 2023
Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction

White, J.; Raghuvanshi, A.; Pruksachatkun, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():2895-2904

Association for Computational Linguistics (ACL) 2023

Ref ID: 5184

Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests. While large language models have found success automating these dialogues in constrained environments, their widespread deployment is limited by the substantial quantities of task-specific data required for training. The following paper presents a data-efficient solution to constructing dialogue systems, leveraging explicit instructions derived from agent guidelines, such as company policies or customer service manuals. Our proposed Knowledge-Augmented Dialogue System (KADS) combines a large language model with a knowledge retrieval module that pulls documents outlining relevant procedures from a predefined set of policies, given a user-agent interaction. To train this system, we introduce a semi-supervised pre-training scheme that employs dialogue-document matching and action-oriented masked language modeling with partial parameter freezing. We evaluate the effectiveness of our approach on prominent task-oriented dialogue datasets, Action-Based Conversations Dataset and Schema-Guided Dialogue, for two dialogue tasks: action state tracking and workflow discovery. Our results demonstrate that procedural knowledge augmentation improves accuracy predicting in- and out-of-distribution actions while preserving high performance in settings with low or sparse data. © 2023 Association for Computational Linguistics.
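The action-oriented masked language modeling described above can be sketched as a masking routine that hides action tokens with a higher probability than ordinary tokens. This is a toy illustration of the idea, not the KADS implementation; the vocabulary, probabilities, and function name are invented:

```python
import random

def mask_tokens(tokens, action_vocab, p_action=0.5, p_other=0.15,
                mask="[MASK]", seed=0):
    """Action-oriented masking: tokens in the action vocabulary are masked
    with a higher probability than ordinary tokens."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        p = p_action if tok in action_vocab else p_other
        if rng.random() < p:
            masked.append(mask)
            targets.append(tok)   # the model would be trained to recover these
        else:
            masked.append(tok)
            targets.append(None)  # no loss on unmasked positions
    return masked, targets
```

Training the language model to recover the masked action tokens pushes it toward representing procedure steps, which is the intuition behind the paper's pre-training scheme.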

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1162 - Winter 2024
DDxGym: Online Transformer Policies in a Knowledge Graph Based Natural Language Environment

Winter, B.; Figueroa, A.; Löser, A.; Gers, F. A.; Figueroa, N.; Krestel, R.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():4438-4448

European Language Resources Association (ELRA) 2024

Ref ID: 4573

Differential diagnosis (DDx) is vital for physicians and challenging due to the existence of numerous diseases and their complex symptoms. Model training for this task is generally hindered by limited data access due to privacy concerns. To address this, we present DDxGym, a specialized OpenAI Gym environment for clinical differential diagnosis. DDxGym formulates DDx as a natural-language-based reinforcement learning (RL) problem, where agents emulate medical professionals, selecting examinations and treatments for patients with randomly sampled diseases. This RL environment utilizes data labeled from online resources, evaluated by medical professionals for accuracy. Transformers, while effective for encoding text in DDxGym, are unstable in online RL. For that reason we propose a novel training method using an auxiliary masked language modeling objective for policy optimization, resulting in model stabilization and significant performance improvement over strong baselines. Following this approach, our agent effectively navigates large action spaces and identifies universally applicable actions. All data, environment details, and implementation, including experiment reproduction code, are made publicly available. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
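The environment interface described above can be sketched as a minimal Gym-style class where natural-language actions examine findings or commit to a diagnosis. This is an invented toy (two made-up diseases, a simplified reward), not DDxGym itself:

```python
class ToyDDxEnv:
    """Minimal Gym-style sketch of a text-based differential-diagnosis
    environment: disease/finding tables here are invented."""
    DISEASES = {"flu": {"fever", "cough"}, "migraine": {"headache", "nausea"}}

    def __init__(self, disease="flu"):
        self.disease = disease

    def reset(self):
        self.revealed = set()
        return "A patient presents with an unknown condition."

    def step(self, action):
        # actions are natural-language commands, e.g. "examine fever"
        verb, _, arg = action.partition(" ")
        if verb == "examine":
            present = arg in self.DISEASES[self.disease]
            if present:
                self.revealed.add(arg)
            return f"{arg}: {'present' if present else 'absent'}", 0.0, False
        if verb == "diagnose":
            correct = arg == self.disease
            return ("correct" if correct else "wrong"), (1.0 if correct else -1.0), True
        return "unknown action", 0.0, False

env = ToyDDxEnv()
obs = env.reset()
finding = env.step("examine fever")    # → ('fever: present', 0.0, False)
outcome = env.step("diagnose flu")     # → ('correct', 1.0, True)
```

An RL agent (in the paper, a transformer policy) would read the textual observations and choose the next examination or treatment action.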

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#263 - Wu 2024
Exploring the reversal curse and other deductive logical reasoning in BERT and GPT-based large language models

Wu, D.; Yang, J. Y.; Wang, K.

Patterns 2024;5(9):12

2024

DOI: 10.1016/j.patter.2024.101030 · Ref ID: 3266

The "Reversal Curse" describes the inability of autoregressive decoder large language models (LLMs) to deduce "B is A" from "A is B," assuming that B and A are distinct and can be uniquely identified from each other. This logical failure suggests limitations in using generative pretrained transformer (GPT) models for tasks like constructing knowledge graphs. Our study revealed that a bidirectional LLM, bidirectional encoder representations from transformers (BERT), does not suffer from this issue. To investigate further, we focused on more complex deductive reasoning by training encoder and decoder LLMs to perform union and intersection operations on sets. While both types of models managed tasks involving two sets, they struggled with operations involving three sets. Our findings underscore the differences between encoder and decoder models in handling logical reasoning. Thus, selecting BERT or GPT should depend on the task's specific needs, utilizing BERT's bidirectional context comprehension or GPT's sequence prediction strengths.
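The two probes described above — reversal pairs and set operations — can be sketched as data generators; a model free of the reversal curse should score equally well on both elements of each pair. This is a toy illustration of the probe design, not the paper's dataset code; the example facts are invented:

```python
def make_reversal_pairs(facts):
    """For each fact (A, B) meaning "A is B", emit the forward statement and
    its reversal -- the pair used to probe the reversal curse."""
    return [(f"{a} is {b}", f"{b} is {a}") for a, b in facts]

def set_op_example(a, b, op):
    """Prompt/target pair for the set union/intersection tasks studied
    in the paper."""
    result = sorted(set(a) | set(b)) if op == "union" else sorted(set(a) & set(b))
    prompt = f"{op} of {sorted(set(a))} and {sorted(set(b))} ="
    return prompt, result
```

Evaluating a decoder model on the reversed statements, after training only on the forward ones, exposes the asymmetry the abstract describes.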

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1059 - Wu 2023
Chain of Thought Prompting Elicits Knowledge Augmentation

Wu, D.; Zhang, J.; Huang, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():6519-6534

Association for Computational Linguistics (ACL) 2023

Ref ID: 5220

The knowledge-augmented deep learning paradigm refers to a paradigm in which domain knowledge is identified and integrated into deep models. Conventional methods typically employ task-specific approaches to gather external knowledge from various sources. In contrast, large language models are extensively pre-trained and can serve as a comprehensive source of external knowledge. In this paper, we propose CoT-KA, a Chain-of-Thought-based method that augments knowledge for deep learning. CoT-KA avoids the need for additional knowledge retrieval or knowledge reasoning models, as required in conventional augmentation methods. Our results demonstrate that CoT-KA outperforms both pure CoT-based methods and the non-augmented method across the majority of eleven publicly available benchmarks for various reasoning tasks. © 2023 Association for Computational Linguistics.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1119 - Wu 2023
CONIC10K: A Challenging Math Problem Understanding and Reasoning Dataset

Wu, H.; Hui, W.; Chen, Y.; Wu, W.; Tu, K.; Zhou, Y.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():6444-6458

Association for Computational Linguistics (ACL) 2023

Ref ID: 5084

Mathematical understanding and reasoning are crucial tasks for assessing the capabilities of artificial intelligence (AI). However, existing benchmarks either require just a few steps of reasoning, or only contain a small amount of data in one specific topic, making it hard to analyse AI's behaviour with reference to different problems within a specific topic in detail. In this work, we propose CONIC10K, a challenging math problem dataset on conic sections in Chinese senior high school education. Our dataset contains various problems with different reasoning depths, while only the knowledge from conic sections is required. Since the dataset only involves a narrow range of knowledge, it is easy to separately analyse the knowledge a model possesses and the reasoning ability it has. For each problem, we provide a high-quality formal representation, the reasoning steps, and the final solution. Experiments show that existing large language models, including GPT-4, exhibit weak performance on complex reasoning. We hope that our findings could inspire more advanced techniques for precise natural language understanding and reasoning. Our dataset and codes are available at https://github.com/whyNLP/Conic10K. © 2023 Association for Computational Linguistics.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#1090 - Wu 2024
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind

Wu, J.; Chen, Z.; Deng, J.; Sabour, S.; Meng, H.; Huang, M.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():15984-16007

Association for Computational Linguistics (ACL) 2024

Ref ID: 4364

Theory of mind (ToM) refers to humans' ability to understand and infer the desires, beliefs, and intentions of others. The acquisition of ToM plays a key role in humans' social cognition and interpersonal relations. Though indispensable for social intelligence, ToM is still lacking for modern AI and NLP systems since they cannot access the human mental state and cognitive process beneath the training corpus. To empower AI systems with the ToM ability and narrow the gap between them and humans, in this paper, we propose COKE: the first cognitive knowledge graph for machine theory of mind, formalizing cognitive processes as a chained structure. Specifically, COKE formalizes ToM as a collection of 45k+ manually verified cognitive chains that characterize human mental activities and subsequent behavioral/affective responses when facing specific social circumstances. In addition, we further generalize COKE using LLMs and build a powerful generation model COLM tailored for cognitive reasoning. Experimental results in both automatic and human evaluation demonstrate the high quality of COKE, the superior ToM ability of COLM, and its potential to significantly enhance social applications. We release our code and data at https://github.com/jincenziwu/COKE. © 2024 Association for Computational Linguistics.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1818 - Wu 2024
Research Progress on Digitalization and Intelligence in Food Domain Based on Knowledge Graphs

Wu, J.; Li, L.; Wu, Z.; Yu, C.; Cheng, J.; Zeng, X.; Zhao, X.; Yang, Y.; Ma, J.

J. Food Sci. Technol. 2024;42(5):24-32

2024

DOI: 10.12301/spxb202400610 · Ref ID: 3871

With the development of technologies such as big data and cloud computing, the scale of data in the food domain is growing at an astonishing rate. These data not only come from diverse sources and have complex structures, but also lack standardized terminology, which poses challenges to the effective integration and utilization of food-related data. Knowledge graphs, as a fundamental cornerstone of achieving general artificial intelligence, provide support for the organization and management of food data and its higher-level applications in terms of integration and semantic understanding. Summarizing recent research achievements of knowledge graphs in the food domain, this review covers their construction methods, including key steps such as ontology construction, knowledge extraction, knowledge fusion, and processing. Current applications of knowledge graphs in the food domain are surveyed in three areas in particular: food nutrition and health, food innovation and research, and food safety and traceability. Based on the current state of development, and considering multimodal data fusion technology, large language model construction, and the intelligentization of industrial equipment in the food field, future development directions of knowledge graphs in the food domain are anticipated. © 2024 Beijing Technology and Business University, Department of Science and Technology. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#312 - Wu 2024
Geospatial Big Data: Survey and Challenges

Wu, J. Y.; Gan, W. S.; Chao, H. C.; Yu, P. S.

IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2024;17():17007-17020

2024

DOI: 10.1109/jstars.2024.3438376 · Ref ID: 3401

In recent years, geospatial big data (GBD) has attracted attention across various disciplines, categorized into big Earth observation data and big human behavior data. Identifying geospatial patterns from GBD has been a vital research focus in the fields of urban management and environmental sustainability. This article reviews the evolution of GBD mining and its integration with advanced artificial intelligence techniques. GBD consists of data generated by satellites, sensors, mobile devices, and geographical information systems, and we categorize geospatial data based on different perspectives. We outline the process of GBD mining and demonstrate how it can be incorporated into a unified framework. In addition, we explore new technologies, such as large language models, the metaverse, and knowledge graphs, and how they could make GBD even more useful. We also share examples of GBD helping with city management and protecting the environment. Finally, we discuss the real challenges that come up when working with GBD, such as issues with data retrieval and security. Our goal is to give readers a clear view of where GBD mining stands today and where it might go next.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2044 - Wu 2023
Zero-Shot Construction of Chinese Medical Knowledge Graph with ChatGPT

Wu, L. I.; Li, G.

Proceedings - 2023 1st IEEE International Conference on Medical Artificial Intelligence, MedAI 2023 2023;():278-283

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/MedAI59581.2023.00043 · Ref ID: 4959

Knowledge graphs have revolutionized the organization and retrieval of real-world knowledge, prompting interest in automatic NLP-based approaches for extracting medical knowledge from texts. However, the availability of high-quality Chinese medical knowledge remains limited, posing challenges for constructing Chinese medical knowledge graphs. As LLMs like ChatGPT show promise in zero-shot learning for many NLP downstream tasks, their potential on constructing Chinese medical knowledge graphs is still uncertain. In this study, we create a Chinese medical knowledge graph by manually annotating textual data and using ChatGPT to automatically generate the graph. We refine the results using filtering and mapping rules to align with our schema. The manually generated graph serves as the ground truth for evaluation, and we explore different methods to enhance its accuracy through knowledge graph completion techniques. As a result, we emphasize the potential of employing ChatGPT for automated knowledge graph construction within the Chinese medical domain. While ChatGPT successfully identifies a larger number of entities, further enhancements are required to improve its performance in extracting more qualified relations. © 2023 IEEE.
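The filtering step described above — keeping only generated triples that fit the target schema — can be sketched as a type check against an allowed set of (head type, relation, tail type) patterns. The schema, entity types, and triples below are invented for illustration; the paper's actual rules and schema are not shown:

```python
# toy schema: which (head type, relation, tail type) patterns are allowed
SCHEMA = {("Disease", "treated_by", "Drug"), ("Drug", "has_dose", "Dose")}

def filter_triples(triples, entity_types, schema=SCHEMA):
    """Keep only LLM-generated triples whose typed pattern matches the schema,
    discarding out-of-schema or ambiguous generations."""
    kept = []
    for h, r, t in triples:
        if (entity_types.get(h), r, entity_types.get(t)) in schema:
            kept.append((h, r, t))
    return kept

generated = [("diabetes", "treated_by", "metformin"),
             ("metformin", "causes", "nausea")]          # second triple is off-schema
types = {"diabetes": "Disease", "metformin": "Drug", "nausea": "Symptom"}
clean = filter_triples(generated, types)
```

Only the schema-conformant triple survives, which is how such rules cut imprecise generations from the graph.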

Kwesi voted
brandon voted
Final decision
What was the agreed final decision?

#1807 - Wu 2023
Research on Intelligent Question-Answering Systems Based on Large Language Models and Knowledge Graphs

Wu, Q.; Wang, Y.

Proceedings - 2023 16th International Symposium on Computational Intelligence and Design, ISCID 2023 2023;():161-164

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ISCID59865.2023.00045 · Ref ID: 4938

With the continuous development of artificial intelligence and cloud computing technologies, the emergence of large language models (LLMs) has created new opportunities for intelligent applications. However, large language models may lack authenticity and accuracy when providing answers in specific professional domains, and they may even generate "illusory facts." In response to these limitations, this paper proposes using large language models together with knowledge graph technology to construct an intelligent question-answering system for specific fields. Through systematic training and optimization, efficient domain-specific knowledge Q&A is achieved, improving the satisfaction rate of domain-specific knowledge Q&A. The intelligent question-answering system based on large models and knowledge graphs brings more convenience to people's lives and work, helping users obtain intelligent solutions in fields such as education, healthcare, and customer service. ©2023 IEEE.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1173 - Wu 2023
DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models

Wu, X.; Li, J.; Xu, M.; Dong, W.; Wu, S.; Bian, C.; Xiong, D.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():2875-2886

Association for Computational Linguistics (ACL) 2023

Ref ID: 5100

Large language models pretrained on a huge amount of data capture rich knowledge and information in the training data. The ability of data memorization and regurgitation in pretrained language models, revealed in previous studies, brings the risk of data leakage. In order to effectively reduce these risks, we propose a framework DEPN to Detect and Edit Privacy Neurons in pretrained language models, partially inspired by knowledge neurons and model editing. In DEPN, we introduce a novel method, termed as privacy neuron detector, to locate neurons associated with private information, and then edit these detected privacy neurons by setting their activations to zero. Furthermore, we propose a privacy neuron aggregator to dememorize private information in a batch processing manner. Experimental results show that our method can significantly and efficiently reduce the exposure of private data leakage without deteriorating the performance of the model. Additionally, we empirically demonstrate the relationship between model memorization and privacy neurons, from multiple perspectives, including model size, training time, prompts, privacy neuron distribution, illustrating the robustness of our approach. ©2023 Association for Computational Linguistics.
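The editing step described above — setting the activations of detected privacy neurons to zero — can be sketched in one line over a toy activation vector (a list-based illustration, not DEPN's tensor-level implementation):

```python
def edit_privacy_neurons(activations, privacy_neurons):
    """Zero out the activations at the indices flagged as privacy neurons,
    leaving all other neurons untouched."""
    return [0.0 if i in privacy_neurons else a
            for i, a in enumerate(activations)]

# toy layer with three neurons; the detector flagged neuron 1 as private
edited = edit_privacy_neurons([0.3, 1.2, -0.7], privacy_neurons={1})
```

In the real framework the same masking is applied inside the model's feed-forward layers during inference, so the memorized private string can no longer be regurgitated.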

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#2076 - Wu 2024
reguloGPT: Harnessing GPT for Knowledge Graph Construction of Molecular Regulatory Pathways

Wu, X.; Zeng, Y.; Das, A.; Jo, S.; Zhang, T.; Patel, P.; Zhang, J.; Gao, S. J.; Pratt, D.; Chiu, Y. C.; Huang, Y.

bioRxiv 2024;():

2024

DOI: 10.1101/2024.01.27.577521 · Ref ID: 5886

MOTIVATION: Molecular Regulatory Pathways (MRPs) are crucial for understanding biological functions. Knowledge Graphs (KGs) have become vital in organizing and analyzing MRPs, providing structured representations of complex biological interactions. Current tools for mining KGs from biomedical literature are inadequate in capturing complex, hierarchical relationships and contextual information about MRPs. Large Language Models (LLMs) like GPT-4 offer a promising solution, with advanced capabilities to decipher the intricate nuances of language. However, their potential for end-to-end KG construction, particularly for MRPs, remains largely unexplored. RESULTS: We present reguloGPT, a novel GPT-4 based in-context learning prompt, designed for the end-to-end joint named entity recognition, N-ary relationship extraction, and context predictions from a sentence that describes regulatory interactions with MRPs. Our reguloGPT approach introduces a context-aware relational graph that effectively embodies the hierarchical structure of MRPs and resolves semantic inconsistencies by embedding context directly within relational edges. We created a benchmark dataset including 400 annotated PubMed titles on N6-methyladenosine (m(6)A) regulations. Rigorous evaluation of reguloGPT on the benchmark dataset demonstrated marked improvement over existing algorithms. We further developed a novel G-Eval scheme, leveraging GPT-4 for annotation-free performance evaluation and demonstrated its agreement with traditional annotation-based evaluations. Utilizing reguloGPT predictions on m(6)A-related titles, we constructed the m(6)A-KG and demonstrated its utility in elucidating m(6)A's regulatory mechanisms in cancer phenotypes across various cancers. These results underscore reguloGPT's transformative potential for extracting biological knowledge from the literature.
AVAILABILITY AND IMPLEMENTATION: The source code of reguloGPT, the m(6)A title and benchmark datasets, and m(6)A-KG are available at: https://github.com/Huang-AI4Medicine-Lab/reguloGPT.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#3842 - Wu 2023
Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering

Wu, Yike; Hu, Nan; Bi, Sheng; Qi, Guilin; Ren, Jie; Xie, Anhuan; Song, Wei

arXiv 2023;():

2023

Ref ID: 7836

Despite their competitive performance on knowledge-intensive tasks, large language models (LLMs) still have limitations in memorizing all world knowledge especially long tail knowledge. In this paper, we study the KG-augmented language model approach for solving the knowledge graph question answering (KGQA) task that requires rich world knowledge. Existing work has shown that retrieving KG knowledge to enhance LLMs prompting can significantly improve LLMs performance in KGQA. However, their approaches lack a well-formed verbalization of KG knowledge, i.e., they ignore the gap between KG representations and textual representations. To this end, we propose an answer-sensitive KG-to-Text approach that can transform KG knowledge into well-textualized statements most informative for KGQA. Based on this approach, we propose a KG-to-Text enhanced LLMs framework for solving the KGQA task. Experiments on several KGQA benchmarks show that the proposed KG-to-Text augmented LLMs approach outperforms previous KG-augmented LLMs approaches regarding answer accuracy and usefulness of knowledge statements.
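The KG-to-Text gap described above can be sketched with a template-based verbalizer that turns triples into statements an LLM can read in a prompt. The paper instead learns an answer-sensitive rewriter; the templates and relations below are invented stand-ins:

```python
# invented relation templates; unknown relations fall back to "h r t."
TEMPLATES = {
    "capital_of": "{h} is the capital of {t}.",
    "born_in": "{h} was born in {t}.",
}

def verbalize(triples):
    """Turn retrieved KG triples into natural-language knowledge statements
    for inclusion in an LLM prompt."""
    return " ".join(
        TEMPLATES.get(r, "{h} {r} {t}.").format(h=h, r=r, t=t)
        for h, r, t in triples
    )

statement = verbalize([("Paris", "capital_of", "France")])
```

Feeding `statement` into the prompt, instead of the raw `(Paris, capital_of, France)` tuple, is the verbalization step whose quality the paper argues matters for KGQA accuracy.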

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3299 - Wu 2024
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering

Wu, Yike; Huang, Yi; Hu, Nan; Hua, Yuncheng; Qi, Guilin; Chen, Jiaoyan; Pan, Jeff Z.

arXiv 2024;():

2024

Ref ID: 8638

Recent studies have explored the use of Large Language Models (LLMs) with Retrieval Augmented Generation (RAG) for Knowledge Graph Question Answering (KGQA). They typically require rewriting retrieved subgraphs into natural language formats comprehensible to LLMs. However, when tackling complex questions, the knowledge rewritten by existing methods may include irrelevant information, omit crucial details, or fail to align with the question's semantics. To address them, we propose a novel rewriting method CoTKR, Chain-of-Thought Enhanced Knowledge Rewriting, for generating reasoning traces and corresponding knowledge in an interleaved manner, thereby mitigating the limitations of single-step knowledge rewriting. Additionally, to bridge the preference gap between the knowledge rewriter and the question answering (QA) model, we propose a training strategy PAQAF, Preference Alignment from Question Answering Feedback, for leveraging feedback from the QA model to further optimize the knowledge rewriter. We conduct experiments using various LLMs across several KGQA benchmarks. Experimental results demonstrate that, compared with previous knowledge rewriting methods, CoTKR generates the most beneficial knowledge representation for QA models, which significantly improves the performance of LLMs in KGQA.

Kwesi voted
yuexi voted
Final decision
What was the agreed final decision?

#3746 - Wu 2023
Online Continual Knowledge Learning for Language Models

Wu, Yuhao; Shi, Tongjun; Sharma, Karthick; Seah, Chun Wei; Zhang, Shuhao

arXiv 2023;():

2023

Ref ID: 7948

Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking. However, this knowledge can become obsolete as global contexts change. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dynamic nature of world knowledge in LMs under real-time constraints. We propose a new benchmark and evaluation metric designed to measure both the rate of new knowledge acquisition and the retention of previously learned knowledge. Our empirical evaluation, conducted using a variety of state-of-the-art methods, establishes robust baselines for OCKL. Our results reveal that existing continual learning approaches are unfortunately insufficient for tackling the unique challenges posed by OCKL. We identify key factors that influence the trade-off between knowledge acquisition and retention, thereby advancing our understanding of how to train LMs in a continually evolving environment.
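The retention side of the acquisition/retention trade-off can be sketched as the fraction of facts a model answered correctly before an update that it still answers correctly afterwards. This is one plausible formulation for illustration; the benchmark's exact metric may differ:

```python
def retention(before, after):
    """Fraction of previously-correct question IDs that remain correct after
    a knowledge update; maps {question_id: bool} snapshots to a score."""
    prev_correct = {q for q, ok in before.items() if ok}
    if not prev_correct:
        return 1.0  # nothing to forget
    return sum(after.get(q, False) for q in prev_correct) / len(prev_correct)

# toy snapshots: q2 is forgotten after the update, q3 is newly acquired
score = retention({"q1": True, "q2": True, "q3": False},
                  {"q1": True, "q2": False, "q3": True})
```

Pairing this with an analogous acquisition rate (newly correct facts per update) gives the two axes the abstract says the benchmark measures.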

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3574 - Wu 2024
KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment

Wu, Zongzong; Tang, Fengxiao; Zhao, Ming; Li, Yufeng

arXiv 2024;():

2024

Ref ID: 8536

Cyber threat intelligence is a critical tool that many organizations and individuals use to protect themselves from sophisticated, organized, persistent, and weaponized cyber attacks. However, few studies have focused on the quality assessment of threat intelligence provided by intelligence platforms, and this work still requires manual analysis by cybersecurity experts. In this paper, we propose a knowledge graph-based verifier, a novel Cyber Threat Intelligence (CTI) quality assessment framework that combines knowledge graphs and Large Language Models (LLMs). Our approach introduces LLMs to automatically extract OSCTI key claims to be verified and utilizes a knowledge graph consisting of paragraphs for fact-checking. This method differs from the traditional way of constructing complex knowledge graphs with entities as nodes. By constructing knowledge graphs with paragraphs as nodes and semantic similarity as edges, it effectively enhances the semantic understanding ability of the model and simplifies labeling requirements. Additionally, to fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources. To the best of our knowledge, this work is the first to create a dataset on threat intelligence reliability verification, providing a reference for future research. Experimental results show that KGV (Knowledge Graph Verifier) significantly improves the performance of LLMs in intelligence quality assessment. Compared with traditional methods, we reduce a large amount of data annotation while the model still exhibits strong reasoning capabilities. Finally, our method can achieve XXX accuracy in network threat assessment.
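The graph construction described above — paragraphs as nodes, semantic similarity as edges — can be sketched with a similarity threshold. Jaccard overlap over word sets stands in here for the semantic similarity the paper would compute with a stronger model; the example paragraphs are invented:

```python
def jaccard(a, b):
    """Word-set overlap between two paragraphs, as a cheap similarity proxy."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def build_paragraph_graph(paragraphs, threshold=0.2):
    """Return edges (i, j) connecting paragraphs whose similarity meets the
    threshold -- nodes are paragraphs, not individual entities."""
    edges = []
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            if jaccard(paragraphs[i], paragraphs[j]) >= threshold:
                edges.append((i, j))
    return edges

paras = ["the malware spreads via email",
         "the malware spreads via usb",
         "weather is sunny today"]
edges = build_paragraph_graph(paras)
```

Fact-checking a claim then means retrieving its neighborhood in this paragraph graph, rather than resolving individual entity nodes.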

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#3362 - Xi 2024
Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models

Xi, Yunjia; Liu, Weiwen; Lin, Jianghao; Weng, Muyan; Cai, Xiaoling; Zhu, Hong; Zhu, Jieming; Chen, Bo; Tang, Ruiming; Yu, Yong; Zhang, Weinan

arXiv 2024;():

2024

Ref ID: 8547

Recommender systems (RSs) play a pervasive role in today's online services, yet their closed-loop nature constrains their access to open-world knowledge. Recently, large language models (LLMs) have shown promise in bridging this gap. However, previous attempts to directly implement LLMs as recommenders fall short in meeting the requirements of industrial RSs, particularly in terms of online inference latency and offline resource efficiency. Thus, we propose REKI to acquire two types of external knowledge about users and items from LLMs. Specifically, we introduce factorization prompting to elicit accurate knowledge reasoning on user preferences and items. We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption. Subsequently, generated knowledge undergoes efficient transformation and condensation into augmented vectors through a hybridized expert-integrated network, ensuring compatibility. The obtained vectors can then be used to enhance any conventional recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from LLMs. Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with lots of recommendation algorithms and tasks. Now, REKI has been deployed to Huawei's news and music recommendation platforms and gained a 7% and 1.99% improvement during the online A/B test.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1856 - Xia 2023
Secure Co-Creation of Industrial Knowledge Graph: Graph Complement Method with Federated Learning and ChatGPT

Xia, L.; Zheng, P.; Liang, Y.; Zheng, G.; Ling, Z.

IEEE International Conference on Automation Science and Engineering 2023;2023-August():

IEEE Computer Society 2023

DOI: 10.1109/CASE56687.2023.10260382 · Ref ID: 5241

Industrial areas have increasingly developed their own Knowledge Graph (KG) for organizing and leveraging vast amounts of data. One major challenge in constructing KG is the heavy reliance on available resources, restricting the scalability and accuracy of the resulting graphs. To address this issue, an end-to-end method is proposed to create a multi-benefit ecosystem by integrating Federated Learning with ChatGPT (a popular language model). Different stakeholders may leverage ChatGPT to search for novel knowledge that complements their existing KGs, however, this approach could potentially introduce ambiguous and wrong triples into the KG. To overcome this, Federated Learning is applied to align and disambiguate the triples using other industrial KGs as supervision. The proposed method applies a multi-field hyperbolic embedding method to vectorize entities and edges, which are then associatively aggregated to achieve edge replenishment and entity fusion for each encrypted KG. Finally, an incentive win-win mechanism is proposed to motivate diverse stakeholders to contribute to this co-creation actively. A case study is conducted on different industrial KGs to evaluate the proposed method. Results demonstrate that this method provides a practical solution for KG co-creation without compromising data security. © 2023 IEEE.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3515 - Xia 2024
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning

Xia, Tianle; Ding, Liang; Wan, Guojia; Zhan, Yibing; Du, Bo; Tao, Dacheng

arXiv 2024;():

2024

Ref ID: 8272

Answering complex queries over incomplete knowledge graphs (KGs) is a challenging job. Most previous works have focused on learning entity/relation embeddings and simulating first-order logic operators with various neural networks. However, they are bottlenecked by the inability to share world knowledge to improve logical reasoning, thus resulting in suboptimal performance. In this paper, we propose a complex reasoning schema over KG upon large language models (LLMs), containing a curriculum-based logic-aware instruction tuning framework, named LACT. Specifically, we augment the arbitrary first-order logical queries via binary tree decomposition, to stimulate the reasoning capability of LLMs. To address the difficulty gap among different types of complex queries, we design a simple and flexible logic-aware curriculum learning framework. Experiments across widely used datasets demonstrate that LACT has substantial improvements (brings an average +5.5% MRR score) over advanced methods, achieving the new state-of-the-art. Our code and model will be released at GitHub and huggingface soon.
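The binary tree decomposition mentioned above can be sketched as a recursive split of a nested first-order query into atomic sub-queries followed by their combining operators. The tuple encoding and example query are invented for illustration; LACT's actual query representation is not shown:

```python
def decompose(query):
    """Split a nested (op, left, right) query into a post-order list of steps:
    operands are resolved before the operator that combines them."""
    if isinstance(query, str):        # atomic sub-query, e.g. "Friend(x, Bob)"
        return [query]
    op, left, right = query
    return decompose(left) + decompose(right) + [op]

# toy query: (A AND B) OR C
steps = decompose(("or", ("and", "A", "B"), "C"))
```

Presenting the query to the LLM as this explicit step sequence, rather than one opaque formula, is the sense in which decomposition "stimulates" stepwise reasoning.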

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1060 - Xia 2024
Chain-of-History Reasoning for Temporal Knowledge Graph Forecasting

Xia, Y.; Wang, D.; Liu, Q.; Wang, L.; Wu, S.; Zhang, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():16144-16159

Association for Computational Linguistics (ACL) 2024

Ref ID: 4257

Temporal Knowledge Graph (TKG) forecasting aims to predict future facts based on given histories. Most recent graph-based models excel at capturing structural information within TKGs but lack semantic comprehension abilities. With the recent surge of LLMs, LLM-based TKG prediction models have emerged. However, existing LLM-based models exhibit three shortcomings: (1) They focus only on first-order history for prediction while ignoring high-order historical information, so the information provided to LLMs is extremely limited. (2) LLMs struggle to reason optimally under heavy historical information loads. (3) For TKG prediction, the temporal reasoning capability of an LLM alone is limited. To address the first two challenges, we propose Chain-of-History (CoH) reasoning, which explores high-order histories step by step, achieving effective utilization of high-order historical information for LLMs on TKG prediction. To address the third issue, we design CoH as a plug-and-play module that enhances the performance of graph-based models for TKG prediction. Extensive experiments on three datasets and backbones demonstrate the effectiveness of CoH. © 2024 Association for Computational Linguistics.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1156 - Xie 2020
Cyber security entity recognition method based on residual dilation convolution neural network

Xie, B.; Shen, G. W.; Guo, C.; Zhou, Y.; Yu, M.

Ch. J. Netw. Inf. Secur. 2020;6(5):126-138

2020

DOI: 10.11959/j.issn.2096-109x.2020009 · Ref ID: 5756

In recent years, cybersecurity threats have increased, and data-driven security intelligence analysis has become a hot research topic in the field of cybersecurity. In particular, artificial intelligence technology represented by the knowledge graph can support the detection of complex and unknown cyberattacks in multi-source heterogeneous threat intelligence data. Cybersecurity entity recognition is the basis for constructing threat intelligence knowledge graphs. The composition of security entities in open network text data is very complex, which makes it difficult for traditional deep learning methods to identify them accurately. Based on the BERT (pre-training of deep bidirectional transformers) pre-trained language model, a cybersecurity entity recognition model, BERT-RDCNN-CRF, based on a residual dilated convolutional neural network and a conditional random field was proposed. The BERT model was used to train character-level feature vector representations, the residual dilated convolutional neural network was combined to effectively extract the important features of security entities, and the BIO annotation of each character was finally obtained through the CRF. Experiments on a large-scale cybersecurity entity annotation dataset constructed for this work show that the proposed method achieves better results than the LSTM-CRF model, the BiLSTM-CRF model, and traditional entity recognition models. © 2020, Beijing Xintong Media Co., Ltd. All rights reserved.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#880 - Xie 2021
WebKE: Knowledge Extraction from Semi-structured Web with Pre-trained Markup Language Model

Xie, C. H.; Huang, W. H.; Liang, J. Q.; Huang, C. S.; Xiao, Y. H.; Acm

30th ACM International Conference on Information and Knowledge Management (CIKM) 2021;():2211-2220

Univ Queensland, ELECTR NETWORK Assoc Computing Machinery 2021

DOI: 10.1145/3459637.3482491 · Ref ID: 3242

The World Wide Web contains rich, up-to-date information for knowledge graph construction. However, most current relation extraction techniques are designed for free text and thus do not handle semi-structured web content well. In this paper, we propose a novel multi-phase machine reading framework, called WebKE. It processes web content at different granularities by first detecting areas of interest at the DOM tree node level and then extracting relational triples for each area. We also propose HTMLBERT as an encoder for the web content. It is a pre-trained markup language model that fully leverages visual layout information and the DOM-tree structure, without the need for hand-engineered features. Experimental results show that the proposed approach outperforms state-of-the-art methods by a considerable margin. The source code is available at https://github.com/redreamality/webke.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3707 - Xie 2024
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation

Xie, Jiakuan; Cao, Pengfei; Chen, Yuheng; Chen, Yubo; Liu, Kang; Zhao, Jun

arXiv 2024;():

2024

Ref ID: 8396

Knowledge editing aims to adjust the knowledge within large language models (LLMs) to prevent their responses from becoming obsolete or inaccurate. However, existing works on knowledge editing are primarily conducted in a single language, which is inadequate for multilingual language models. In this paper, we focus on multilingual knowledge editing (MKE), which requires propagating updates across multiple languages. This necessity poses a significant challenge for the task. Furthermore, the limited availability of a comprehensive dataset for MKE exacerbates this challenge, hindering progress in this area. Hence, we introduce the Multilingual Knowledge Editing Benchmark (MKEB), a novel dataset comprising 12 languages and providing a complete evaluation framework. Additionally, we propose a method that enhances Multilingual knowledge Editing with neuron-Masked Low-Rank Adaptation (MEMLA). Specifically, we identify two categories of knowledge neurons to improve editing precision. Moreover, we perform LoRA-based editing with neuron masks to efficiently modify parameters and facilitate the propagation of updates across multiple languages. Experiments demonstrate that our method outperforms existing baselines and significantly enhances the multi-hop reasoning capability of the edited model, with minimal impact on its downstream task performance. The dataset and code will be made publicly available.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#118 - Xie 2024
Combining prompt learning with contextual semantics for inductive relation prediction

Xie, S. R.; Pan, Q. F.; Wang, X. Z.; Luo, X. F.; Sugumaran, V.

Expert Syst. Appl. 2024;238():12

2024

DOI: 10.1016/j.eswa.2023.121669 · Ref ID: 3458

Inductive relation prediction for knowledge graphs aims to predict missing relations between two new entities. Most previous studies on relation prediction are limited to the transductive setting and cannot be applied to the inductive one. Recently, some inductive methods have been proposed that handle it by learning topological semantics. However, they rely solely on structural information, disregarding the role of prior knowledge. In cases of sparse structure, this limitation is magnified, hindering inductive ability. Prior knowledge can not only filter out invalid topological structures but also complement topological semantics. To this end, we propose a novel inductive model, PLCS, which incorporates prompt learning with contextual semantics to fully exploit prior knowledge. To filter out irrelevant topological structures, we innovatively employ hard prompts to mine prior knowledge in pre-trained language models (PLMs) as the basis for subgraph extraction. Additionally, we enhance semantic representation by integrating relation text descriptions into relation embeddings during initialization, supplementing topological semantics. Experimental results on four benchmark datasets show the superiority of PLCS over existing state-of-the-art methods.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#1148 - Xie 2024
Creation of a structured solar cell material dataset and performance prediction using large language models

Xie, T.; Wan, Y.; Zhou, Y.; Huang, W.; Liu, Y.; Linghu, Q.; Wang, S.; Kit, C.; Grazian, C.; Zhang, W.; Hoex, B.

Patterns 2024;5(5):

2024

DOI: 10.1016/j.patter.2024.100955 · Ref ID: 4095

Materials scientists usually collect experimental data to summarize experiences and predict improved materials. However, a crucial issue is how to proficiently utilize unstructured data to update existing structured data, particularly in applied disciplines. This study introduces a new natural language processing (NLP) task called structured information inference (SII) to address this problem. We propose an end-to-end approach to summarize and organize the multi-layered device-level information from the literature into structured data. After comparing different methods, we fine-tuned LLaMA with an F1 score of 87.14% to update an existing perovskite solar cell dataset with articles published since its release, allowing its direct use in subsequent data analysis. Using structured information, we developed regression tasks to predict the electrical performance of solar cells. Our results demonstrate comparable performance to traditional machine-learning methods without feature selection and highlight the potential of large language models for scientific knowledge acquisition and material development. © 2024 The Author(s)
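
For reference, the F1 score quoted above (87.14% for the fine-tuned LLaMA extractor) is the harmonic mean of precision and recall. A minimal computation, with made-up counts chosen only for illustration:

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)  # fraction of predicted fields that are correct
    recall = tp / (tp + fn)     # fraction of true fields that were extracted
    return 2 * precision * recall / (precision + recall)
```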

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#3315 - Xie 2023
DARWIN Series: Domain Specific Large Language Models for Natural Science

Xie, Tong; Wan, Yuwei; Huang, Wei; Yin, Zhenyu; Liu, Yixuan; Wang, Shaozhou; Linghu, Qingyuan; Kit, Chunyu; Grazian, Clara; Zhang, Wenjie; Razzak, Imran; Hoex, Bram

arXiv 2023;():

2023

Ref ID: 7816

Emerging tools bring forth fresh approaches to work, and the field of natural science is no different. In natural science, traditional manual, serial, and labour-intensive work is being augmented by automated, parallel, and iterative processes driven by artificial-intelligence-based experimental automation and more. To add new capabilities in natural science, enabling the acceleration and enrichment of the automation of the discovery process, we present DARWIN, a series of tailored LLMs for natural science, mainly physics, chemistry, and materials science. The series builds on open-source LLMs, incorporating structured and unstructured scientific knowledge from public datasets and literature. We fine-tuned the models using over 60,000 instruction data points, emphasizing factual correctness. During fine-tuning, we introduce the Scientific Instruction Generation (SIG) model, automating instruction generation from scientific texts. This eliminates the need for manual extraction or domain-specific knowledge graphs and efficiently injects scientific knowledge into the model. We also explore multi-task training strategies, revealing interconnections between scientific tasks. The DARWIN series not only achieves state-of-the-art results on various scientific tasks but also diminishes reliance on closed-source AI models. Our research showcases the ability of LLMs in the scientific domain, with the overarching goal of fostering prosperity within the broader AI-for-science community.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1703 - Xin 2023
Online Knowledge Fusion Method for Fault Diagnosis of Power Plant Equipment

Xin, Y.; Chen, L.; Yang, Y.

IEEE Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2023;():1236-1240

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ITAIC58329.2023.10408849 · Ref ID: 4994

There are many types of documents in fossil-fuel power stations that describe equipment failures, including maintenance records, treatment and diagnosis suggestions, historical cases, and equipment knowledge. The knowledge of equipment anomaly diagnosis and handling is scattered across these different documents. Extracting and fusing this scattered knowledge to generate a knowledge graph for equipment fault diagnosis provides necessary decision support for maintenance personnel discovering and handling equipment faults. This article proposes an implementation method for extracting and integrating equipment fault knowledge from diverse, multi-type text records to form a knowledge graph. A thermal power plant equipment fault Q&A system based on the fusion of open-source large language models and knowledge graphs has been developed. Main contributions: (1) A knowledge extraction algorithm integrating the BERT-WWM model and a pointer annotation method is proposed to jointly extract entity relations from fault text. Experiments show that the method performs well in extracting overlapped triples, and F1 is improved by 8.51% compared with existing algorithms. (2) A knowledge fusion model based on RoBERTa-BiLSTM is proposed, which fully utilizes the feature information of the entity text to be disambiguated and the entity mention text, and captures interdependent features within a sentence through an attention mechanism. Experiments show that this method improves F1 by 9.56% compared to existing fusion algorithms. (3) Based on the open-source large model ChatGLM, a fusion method of knowledge graph and ChatGLM was explored, and an equipment fault question answering system for thermal power plants was implemented, achieving high accuracy in practical applications. © 2023 IEEE.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1798 - Xing 2020
Relation extraction using language model based on knowledge graph

Xing, C.; Liu, X.; Du, D.; Hu, W.; Zhang, M.

Journal of Physics: Conference Series 2020;1624():

IOP Publishing Ltd 2020

DOI: 10.1088/1742-6596/1624/2/022037 · Ref ID: 5698

Relation extraction is an important task in natural language processing (NLP). Existing methods generally focus on extracting textual semantic information from text, but ignore the relation contextual information from existing relations in datasets, which is very important for the performance of the relation extraction task. In this paper, we represent each individual entity as an embedding based on an entity-and-relation knowledge graph, which encodes the relation contextual information between the given entity pairs and relations. Besides, inspired by the impressive recent performance of language models, we use a language model to leverage word semantic information, which it captures better than static word embeddings. Experimental results on the SemEval-2010 Task 8 dataset show that the F1-score of our proposed method improved by nearly 3% compared with previous methods. © 2020 Institute of Physics Publishing. All rights reserved.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#54 - Xiong 2022
AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL

Xiong, G. M.; Bao, J. W.; Zhao, W.; Wu, Y. Z.; He, X. D.; Acm

31st ACM International Conference on Information and Knowledge Management (CIKM) 2022;():2250-2259

Atlanta, GA Assoc Computing Machinery 2022

DOI: 10.1145/3511808.3557246 · Ref ID: 3435

This study investigates the task of knowledge-based question generation (KBQG). Conventional KBQG works generated questions from fact triples in the knowledge graph, which could not express complex operations like aggregation and comparison in SPARQL. Moreover, due to the costly annotation of large-scale SPARQL-question pairs, KBQG from SPARQL under low-resource scenarios urgently needs to be explored. Recently, since generative pre-trained language models (PLMs) typically trained in a natural language (NL)-to-NL paradigm have proven effective for low-resource generation, e.g., T5 and BART, how to effectively utilize them to generate NL questions from non-NL SPARQL is challenging. To address these challenges, AutoQGS, an auto-prompt approach for low-resource KBQG from SPARQL, is proposed. First, we propose generating questions directly from SPARQL for the KBQG task to handle complex operations. Second, we propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into an NL description, smoothing the low-resource transformation from non-NL SPARQL to NL questions with PLMs. Experimental results on WebQuestionsSP, ComplexWebQuestions 1.1, and PathQuestions show that our model achieves state-of-the-art performance, especially in low-resource settings. Furthermore, a corpus of 330k factoid complex question-SPARQL pairs is generated for further KBQG research.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3292 - Xiong 2024
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents

Xiong, Haoyi; Wang, Zhiyuan; Li, Xuhong; Bian, Jiang; Xie, Zeke; Mumtaz, Shahid; Al-Dulaimi, Anwer; Barnes, Laura E.

arXiv 2024;():

2024

Ref ID: 8453

This article explores the convergence of connectionist and symbolic artificial intelligence (AI), from historical debates to contemporary advancements. Traditionally considered distinct paradigms, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs), exemplified by ChatGPT and GPT-4, highlight the potential of connectionist architectures in handling human language as a form of symbols. The study argues that LLM-empowered Autonomous Agents (LAAs) embody this paradigm convergence. By utilizing LLMs for text-based knowledge modeling and representation, LAAs integrate neuro-symbolic AI principles, showcasing enhanced reasoning and decision-making capabilities. Comparing LAAs with Knowledge Graphs within the neuro-symbolic AI theme highlights the unique strengths of LAAs in mimicking human-like reasoning processes, scaling effectively with large datasets, and leveraging in-context samples without explicit re-training. The research underscores promising avenues in neuro-vector-symbolic integration, instructional encoding, and implicit reasoning, aimed at further enhancing LAA capabilities. By exploring the progression of neuro-symbolic AI and proposing future research trajectories, this work advances the understanding and development of AI technologies.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3912 - Xu 2024
Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing

Xu, Aobo; Chang, Bingyu; Liu, Qingpeng; Jian, Ling

arXiv 2024;():

2024

Ref ID: 8483

Identifying significant references within the complex interrelations of a citation knowledge graph is challenging, which encompasses connections through citations, authorship, keywords, and other relational attributes. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for given scholarly articles utilizing advanced data mining techniques. In the KDD CUP OAG-Challenge PST track, we design a recommendation-based framework tailored for the PST task. This framework employs the Neural Collaborative Filtering (NCF) model to generate final predictions. To process the textual attributes of the papers and extract input features for the model, we utilize SciBERT, a pre-trained language model. According to the experimental results, our method achieved a score of 0.37814 on the Mean Average Precision (MAP) metric, outperforming baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal.
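
The Mean Average Precision (MAP) metric quoted above (0.37814) rewards placing the true pivotal references high in the ranked list. A minimal sketch of the computation, with toy IDs standing in for papers:

```python
def average_precision(ranked, relevant):
    """Average precision for one query: mean of precision@k over the
    ranks k at which a relevant item appears."""
    hits, score = 0, 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / k
    return score / max(len(relevant), 1)

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```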

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3285 - Xu 2024
Context Graph

Xu, Chengjin; Li, Muzhi; Yang, Cehao; Jiang, Xuhui; Tang, Lumingyuan; Qi, Yiyan; Guo, Jian

arXiv 2024;():

2024

Ref ID: 8392

Knowledge Graphs (KGs) are foundational structures in many AI applications, representing entities and their interrelations through triples. However, triple-based KGs lack the contextual information of relational knowledge, like temporal dynamics and provenance details, which are crucial for comprehensive knowledge representation and effective reasoning. Instead, Context Graphs (CGs) expand upon the conventional structure by incorporating additional information such as time validity, geographic location, and source provenance. This integration provides a more nuanced and accurate understanding of knowledge, enabling KGs to offer richer insights and support more sophisticated reasoning processes. In this work, we first discuss the inherent limitations of triple-based KGs and introduce the concept of CGs, highlighting their advantages in knowledge representation and reasoning. We then present a context graph reasoning CGR³ paradigm that leverages large language models (LLMs) to retrieve candidate entities and related contexts, rank them based on the retrieved information, and reason whether sufficient information has been obtained to answer a query. Our experimental results demonstrate that CGR³ significantly improves performance on KG completion (KGC) and KG question answering (KGQA) tasks, validating the effectiveness of incorporating contextual information in KG representation and reasoning.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3109 - Xu 2024
Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research

Xu, Haowen; Li, Xueping; Tupayachi, Jose; Lian, Jianming Jamie; Omitaomu, Olufemi A.

Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI 2024;():43–49

Atlanta, GA, USA Association for Computing Machinery 2024

DOI: 10.1145/3681780.3697252 · Ref ID: 7292

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3862 - Xu 2023
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks

Xu, Shicheng; Pang, Liang; Shen, Huawei; Cheng, Xueqi; Chua, Tat-Seng

arXiv 2023;():

2023

Ref ID: 7684

Making the content generated by large language models (LLMs) accurate, credible, and traceable is crucial, especially in complex knowledge-intensive tasks that require multi-step reasoning in which each step needs knowledge to solve. Retrieval-augmented generation has good potential to solve this problem. However, where and how to introduce information retrieval (IR) into an LLM is a big challenge. Previous work suffers from two problems: wrong knowledge retrieved by IR misleads the LLM, and interaction between IR and the LLM breaks the LLM's reasoning chain. This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between LLM and IR to solve these challenges. First, the LLM generates a reasoning chain named Chain-of-Query (CoQ), where each node consists of an IR-oriented query-answer pair. Second, IR verifies the answer of each node of the CoQ. It corrects an answer that is not consistent with the retrieved information when IR gives high confidence, which improves credibility. Third, the LLM can indicate its missing knowledge in the CoQ and rely on IR to provide this knowledge. These operations improve accuracy in terms of reasoning and knowledge. Finally, SearChain generates the reasoning process and marks references to supporting documents for each reasoning step, which improves traceability. Interaction with IR in SearChain forms a novel tree-based reasoning path, which enables the LLM to dynamically modify the direction of reasoning. Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks including multi-hop Q&A, slot filling, fact checking, and long-form Q&A.
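
The CoQ verification step described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the node format, the 0.9 confidence threshold, and the stubbed `retrieve` function are all assumptions.

```python
def verify_chain(chain, retrieve):
    """chain: list of {"query": ..., "answer": ...} nodes, as in a CoQ.
    retrieve(query) -> (evidence, confidence). The retrieved evidence
    overrides the LLM's answer only when retrieval is confident and
    disagrees with it."""
    for node in chain:
        evidence, confidence = retrieve(node["query"])
        if confidence > 0.9 and evidence != node["answer"]:
            node["answer"] = evidence  # correct the node with IR evidence
    return chain
```

Low-confidence retrievals leave the node untouched, so the LLM's reasoning chain is only interrupted when IR has strong grounds to correct it.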

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#433 - Xu 2024
Knowledge graph construction for heart failure using large language models with prompt engineering

Xu, T. H.; Gu, Y. X.; Xue, M. T.; Gu, R. J.; Li, B.; Gu, X.

Front. Comput. Neurosci. 2024;18():16

2024

DOI: 10.3389/fncom.2024.1389475 · Ref ID: 3023

Introduction: Constructing an accurate and comprehensive knowledge graph of specific diseases is critical for practical clinical disease diagnosis and treatment, reasoning and decision support, rehabilitation, and health management. For knowledge graph construction tasks (such as named entity recognition and relation extraction), classical BERT-based methods require a large amount of training data to ensure model performance. However, real-world medical annotation data, especially disease-specific annotation samples, are very limited. In addition, existing models do not perform well in recognizing out-of-distribution entities and relations that are not seen in the training phase. Method: In this study, we present a novel and practical pipeline for constructing a heart failure knowledge graph using large language models and medical expert refinement. We apply prompt engineering to the three phases of construction: schema design, information extraction, and knowledge completion. The best performance is achieved by designing task-specific prompt templates combined with the TwoStepChat approach. Results: Experiments on two datasets show that the TwoStepChat method outperforms the vanilla prompt and the fine-tuned BERT-based baselines. Moreover, our method saves 65% of the time compared to manual annotation and is better suited to extracting out-of-distribution information in the real world.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3026 - Xu 2010
Towards intelligent query processing based on Attribute-Oriented Generalization

Xu, X.; Jiandong, Yang

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery 2010;5():2026-2030

2010

DOI: 10.1109/FSKD.2010.5569667 · Ref ID: 6644

Due to the intrinsic characteristics of the relational model, the standard database user interface forces users to be familiar with the database schema and underlying data to improve the efficiency of information retrieval. This paper proposes intelligent query processing based on Attribute-Oriented Generalization (AOG) to improve the user experience during information retrieval. First, it adds semantics discarded by the relational model to raw data through attribute-oriented generalization and builds instances of a Type Abstraction Hierarchy (TAH). Second, it constructs the knowledge base, which is designed on a relational model, so that both the knowledge base and the underlying relational database can be handled in a single formalism by a relational query language. Third, with the derived specific knowledge base incorporated into the underlying database, a prototype intelligent, interactive, and straightforward query processing system with a B/S architecture has been built on top of SQL, which returns semantically neighboring values and higher-level, more abstract values through the TAH instance according to the results of the interaction with the user. Finally, three practical query examples are presented to further exemplify the main ideas and demonstrate the usefulness of the proposed query answering processes.

Davis voted
Mike voted
Final decision
What was the agreed final decision?

#715 - Xu 2024
A representation learning-based approach to enhancing manufacturing quality for low-voltage electrical products

Xu, Y. M.; Peng, T.; Tao, J. Q.; Bai, A.; Zhang, N. Y.; Lim, K.

Adv. Eng. Inform. 2024;62():15

2024

DOI: 10.1016/j.aei.2024.102636 · Ref ID: 3485

In low-voltage electrical product manufacturing, resolving quality issues is heavily reliant on engineering experience, and can be time-consuming and error-prone. Through quality management systems, a large number of historical defect cases can be consolidated for analysis along with relevant causes. However, these defect descriptions are often casually written in a mix of Chinese and English, containing domain-specific terms. Additionally, defect product features are varied and have complex relationships. Therefore, historical defect cases have not been effectively utilized to support manufacturing quality issues. To address this challenge, this study proposes a representation learning-based approach to enhance manufacturing quality. Key research contributions include: (1) A two-stage word embedding technique based on a pre-trained language model. First, TSDAE is utilized for unsupervised pre-training on a large amount of unlabeled data. Then, Sentence-BERT is utilized for fine-tuning on a small set of labeled similar sentence pairs. This process yields a pre-trained language model specific to low-voltage electrical product defects. (2) The NSHPSAGE graph embedding model based on the constructed product feature knowledge graph. We select more valuable neighboring nodes during sampling and explore different aggregation functions to enhance graph embedding performance. This model effectively aggregates product feature information into "Defect_Case" nodes, yielding graph embedding vectors. The model exhibits good Weighted-Precision and Weighted-Recall with a short training duration, and it can handle new nodes, addressing the issue of heterogeneous graph embedding. (3) A defect case recommendation technique that fuses word embedding and graph embedding. We use Multi-Head Attention Fusion in the late fusion to obtain defect case vectors. This approach comprehensively considers defect description semantic knowledge and complex product feature relationships, enabling accurate defect case recommendation with the prototype system.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#729 - Xu 2024
Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering

Xu, Z. T.; Cruz, M. J.; Guevara, M.; Wang, T.; Deshpande, M.; Wang, X. F.; Li, Z.; Assoc Computing, Machinery

47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():2905-2909

Washington, DC Assoc Computing Machinery 2024

DOI: 10.1145/3626772.3661370 · Ref ID: 3251

In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval-augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations, which limits performance. We introduce a novel customer service question-answering method that amalgamates RAG with a knowledge graph (KG). Our method constructs a KG from historical issues for use in retrieval, retaining the intra-issue structure and inter-issue relations. During the question-answering phase, our method parses consumer queries and retrieves related sub-graphs from the KG to generate answers. This integration of a KG not only improves retrieval accuracy by preserving customer service structure information but also enhances answering quality by mitigating the effects of text segmentation. Empirical assessments on our benchmark datasets, utilizing key retrieval (MRR, Recall@K, NDCG@K) and text generation (BLEU, ROUGE, METEOR) metrics, reveal that our method outperforms the baseline by 77.6% in MRR and by 0.32 in BLEU. Our method has been deployed within LinkedIn's customer service team for approximately six months and has reduced the median per-issue resolution time by 28.6%.
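
The sub-graph retrieval step described above can be sketched minimally as a bounded-hop walk over the issue KG. This is an illustrative toy, not LinkedIn's implementation: the adjacency-list KG, the ticket/patch entity names, and the 2-hop budget are all assumptions; entity linking and the LLM answer-generation step are omitted.

```python
# Toy issue KG as adjacency lists: head -> [(relation, tail), ...]
KG = {
    "ticket:101": [("duplicates", "ticket:87"), ("component", "login-service")],
    "ticket:87":  [("resolved_by", "patch:42")],
}

def retrieve_subgraph(seed, hops=2):
    """Collect all triples reachable from `seed` within `hops` edges,
    preserving intra-issue structure and inter-issue relations."""
    triples, frontier = [], {seed}
    for _ in range(hops):
        nxt = set()
        for head in frontier:
            for rel, tail in KG.get(head, []):
                triples.append((head, rel, tail))
                nxt.add(tail)
        frontier = nxt
    return triples
```

The returned triples, rather than flat text chunks, would then be serialized into the LLM prompt.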

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#257 - Xu 2024
Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs

Xu, Z. W.; Ichise, R.

26th International Conference on Data Warehousing and Knowledge Discovery (DaWaK) 2024;14912():129-146

Naples, ITALY Springer International Publishing Ag 2024

DOI: 10.1007/978-3-031-68323-7_11 · Ref ID: 3536

During real-world reasoning, the logic path is generally not explicitly articulated. An appropriate causal chain can offer abundant informative details to depict a logical pathway, which is also beneficial in preventing ambiguity problems during text generation. However, most causal chains tend to lose their causal meaning after multiple hops, and this phenomenon also occurs in other chains of relations. To discriminate broken linkages in the chain detection task, we introduce the CK-CEVAE model, Chained domain Knowledge in Cause Effect Variational AutoEncoder, which integrates knowledge into the representation of causal assumptions within chains, employing sequential probabilistic distributions for cause-effect estimation. Our model demonstrates an improvement of around 4% in F1-score over LLM-based and neural-based models in identifying causal chains originating from text. Furthermore, to investigate the semantic continuity of chains within established knowledge graphs, we curate a chain-structured dataset, highlighting both causal relations and multiple non-causal relations (i.e., used for, synonym, and similar to), termed the ConceptNet-CC dataset. We noticed that the longer the chains, the fewer instances exist. However, contrary to our intuitions, models perform better at identifying longer chains than shorter ones in uni-directional relations like causes and used for.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3956 - Xue 2024
Unlock the Power of Frozen LLMs in Knowledge Graph Completion

Xue, Bo; Xu, Yi; Song, Yunchong; Pang, Yiming; Ren, Yuyang; Ding, Jiaxin; Fu, Luoyi; Wang, Xinbing

arXiv 2024;():

2024

Ref ID: 8527

Traditional knowledge graph completion (KGC) methods rely solely on structural information, struggling with the inherent sparsity of knowledge graphs (KGs). Large Language Models (LLMs) learn extensive knowledge from large corpora with powerful context modeling, making them promising for mitigating the limitations of previous methods. Directly fine-tuning LLMs offers great capability but comes at the cost of huge time and memory consumption, while utilizing frozen LLMs yields suboptimal results. In this work, we aim to leverage LLMs for KGC effectively and efficiently. We capture the context-aware hidden states of knowledge triples by employing prompts to stimulate the intermediate layers of LLMs. We then train a data-efficient classifier on these hidden states to harness the inherent capabilities of frozen LLMs in KGC. Additionally, to reduce ambiguity and enrich knowledge representation, we generate detailed entity descriptions through subgraph sampling on KGs. Extensive experiments on standard benchmarks demonstrate the efficiency and effectiveness of our approach. We outperform traditional KGC methods across most datasets and, notably, achieve classification performance comparable to fine-tuned LLMs while enhancing GPU memory efficiency by 188× and accelerating training and inference by 13.48×.
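
The core recipe, freezing the LLM and training a small classifier head on hidden states of true versus corrupted triples, can be illustrated with synthetic vectors standing in for the LLM activations (the dimensions, data, and training loop here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for context-aware hidden states of true vs. corrupted triples;
# in the paper these come from intermediate layers of a frozen LLM.
pos = rng.normal(loc=+1.0, scale=0.5, size=(50, 8))
neg = rng.normal(loc=-1.0, scale=0.5, size=(50, 8))
X = np.vstack([pos, neg])
y = np.array([1] * 50 + [0] * 50)

# Data-efficient classifier head: logistic regression by gradient descent.
w, b = np.zeros(8), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * float(np.mean(p - y))

# Triple plausibility = sign of the classifier score.
acc = float(np.mean(((X @ w + b) > 0) == y))
```

Only the small head is trained; the (mocked) LLM never receives gradients, which is what makes the approach memory-efficient.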

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#2254 - Xueqin 2013
Complex event recognition with uncertainty reasoning

Xueqin, Liu; Clawson, K.; Wang, H.; Scotney, B.; Liu, J.

2013 International Conference on Machine Learning and Cybernetics 2013;04():1823-1828

2013

DOI: 10.1109/ICMLC.2013.6890893 · Ref ID: 6271

The goal of complex event recognition considered in this paper is the automatic detection of complex high-level events in videos. This is a difficult task, especially when videos are captured under unconstrained conditions, with poor lighting, heavy background clutter and occlusion. In this paper, we propose a hierarchical knowledge-based framework for complex event recognition. The video event knowledge represents an arbitrary complex spatio-temporal event as a hierarchical composition of simpler events in a natural way. Uncertainty reasoning procedures are applied to interpret low level event descriptions according to the video knowledge base in order to recognize high level scenarios.
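
As a toy illustration of hierarchical composition under uncertainty (not the paper's specific reasoning procedure), sub-event probabilities can be combined with independence-assuming AND/OR rules, where the event names and probabilities below are hypothetical:

```python
def noisy_and(probs):
    """All sub-events must occur; assumes independence."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def noisy_or(probs):
    """The event occurs if any alternative decomposition occurs."""
    p = 1.0
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# A complex event composed of two low-level detections with uncertainty.
p_complex = noisy_and([0.9, 0.8])        # e.g. door_opens AND person_appears
p_scenario = noisy_or([p_complex, 0.3])  # recognized via either decomposition
```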

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2014 - Yadav 2023
Unleashing the Power of Large Language Model, Textual Embeddings, and Knowledge Graphs for Advanced Information Retrieval

Yadav, D.; Para, H.; Selvakumar, P.

International Conference on Electrical, Computer and Energy Technologies, ICECET 2023 2023;():

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICECET58911.2023.10389253 · Ref ID: 4984

Acquiring knowledge beyond the usual expertise is a critical challenge when implementing semantic information solutions for querying a knowledge base. To address this difficulty, one proposed solution was to use knowledge graphs in conjunction with traditional Question & Answering (Q&A) systems. However, this approach struggles with limited facts, difficulty in obtaining further insights into the context, and limited ability to handle complex questions, leading to inaccurate or irrelevant answers. To overcome these limitations, we present an approach for answering inference-based questions that integrates knowledge graphs, a large language model, and relevant embeddings from a vector database. Combining knowledge graphs and word embeddings significantly enhances the strength of both techniques, leading to improved performance of Question and Answering systems. We begin with generating representations of the relevant nodes in the knowledge graph and retrieve the most appropriate information from a collection of stored textual data using word embeddings. This approach tackles the shortcomings of conventional approaches that rely solely on knowledge graphs and are too rigid to handle the nuances of the context. This method provides a sophisticated understanding of language and context, enabling it to handle complex questions that may involve multiple entities and relationships with a better understanding of the facts and context in which the question is being asked. The system's ability to handle complex queries is evidenced through a combination of theoretical analysis and empirical data. Our approach has demonstrated exceptional efficiency on a benchmark dataset, as evidenced by evaluating the F1 score. © 2023 IEEE.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#2718 - Yan 2024
Multi-view Few-shot Reasoning for Emerging Entities in Knowledge Graphs

Yan, C.; Zhao, F.; Tao, X.; Zhu, X.

IEEE Transactions on Big Data 2024;():1-13

2024

DOI: 10.1109/TBDATA.2024.3453749 · Ref ID: 6082

A knowledge graph (KG) is a form of representing knowledge of the objective world. With the expansion of knowledge, KGs frequently incorporate new entities, which often possess limited associated data, known as few-shot features. Addressing the missing knowledge for these emerging entities is crucial practically, but there are significant challenges due to data scarcity. Previously developed methods based on knowledge graph embedding (KGE) and graph neural networks (GNNs) focusing on instance-level KGs are confronted with challenges of data scarcity and model simplicity, rendering them inapplicable to reasoning tasks in few-shot scenarios. To tackle these issues, we propose a multi-view few-shot KG reasoning method for emerging entities. The primary focus of our method lies in resolving the problem of link prediction for emerging entities with limited associated triples from multiple perspectives. Distinct from previous methods, our approach initially abstracts a concept-view KG from the conventional instance-view KG, enabling the formulation of commonsense rules. Additionally, we employ the aggregation of multi-hop subgraph features to enhance the representation of emerging entities. Furthermore, we introduce a more efficient cross-domain negative sampling strategy and a multi-view triple scoring function based on commonsense rules. Our experimental evaluations highlight the effectiveness of our method in few-shot contexts, demonstrating its robustness and adaptability in both cross-shot and zero-shot scenarios, significantly outperforming existing models in these challenging settings.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3561 - Yan 2021
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining

Yan, Ruiqing; Sun, Lanchang; Wang, Fang; Zhang, Xiaoming

arXiv 2021;():

2021

Ref ID: 7458

Though pre-trained language models such as Bert and XLNet, have rapidly advanced the state-of-the-art on many NLP tasks, they implicit semantics only relying on surface information between words in corpus. Intuitively, background knowledge influences the efficacy of understanding. Inspired by this common sense, we focus on improving model pretraining by leveraging explicit knowledge. Different from recent research that optimize pretraining model by knowledge masking strategies, we propose a simple but general method to combine explicit knowledge with pretraining. To be specific, we first match knowledge facts from knowledge graph (KG) and then add a knowledge injunction layer to transformer directly without changing its architecture. The present study seeks to find the direct impact of explicit knowledge on transformer per-training. We conduct experiments on various datasets for different downstream tasks. The experimental results show that solely by adding external knowledge to transformer can improve the learning performance on many NLP tasks.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1542 - Yan 2024
KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration

Yan, Y.; Hou, Y.; Xiao, Y.; Zhang, R.; Wang, Q.

IEEE Trans Visual Comput Graphics 2024;():

2024

DOI: 10.1109/TVCG.2024.3456364 · Ref ID: 4379

The increasing reliance on Large Language Models (LLMs) for health information seeking can pose severe risks due to the potential for misinformation and the complexity of these topics. This paper introduces KNOWNET, a visualization system that integrates LLMs with Knowledge Graphs (KG) to provide enhanced accuracy and structured exploration. Specifically, for enhanced accuracy, KNOWNET extracts triples (e.g., entities and their relations) from LLM outputs and maps them onto validated information and supporting evidence in external KGs. For structured exploration, KNOWNET provides next-step recommendations based on the neighborhood of the currently explored entities in KGs, aiming to guide a comprehensive understanding without overlooking critical aspects. To enable reasoning with both the structured data in KGs and the unstructured outputs from LLMs, KNOWNET conceptualizes the understanding of a subject as the gradual construction of a graph visualization. A progressive graph visualization is introduced to monitor past inquiries, and bridge the current query with the exploration history and next-step recommendations. We demonstrate the effectiveness of our system via use cases and expert interviews. © 2024 IEEE.
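
The validation step described here, mapping LLM-extracted triples onto a KG and recommending neighbors of the current entity as next queries, can be sketched with a toy KG of hypothetical health triples:

```python
# Toy KG as a set of (head, relation, tail) triples; entities are hypothetical.
kg = {
    ("vitamin_c", "may_treat", "common_cold"),
    ("vitamin_c", "interacts_with", "iron"),
}

def validate_triples(extracted, kg):
    """Split LLM-extracted triples into KG-supported and unsupported lists."""
    supported = [t for t in extracted if t in kg]
    unsupported = [t for t in extracted if t not in kg]
    return supported, unsupported

def next_step_recommendations(entity, kg):
    """Neighborhood of the current entity, offered as candidate next queries."""
    return sorted({t for (h, r, t) in kg if h == entity})

sup, unsup = validate_triples(
    [("vitamin_c", "may_treat", "common_cold"),
     ("vitamin_c", "cures", "cancer")], kg)
```

Supported triples would be shown with their KG evidence; unsupported ones flagged as potential hallucinations.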

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#3196 - Yan 2024
Atomic Fact Decomposition Helps Attributed Question Answering

Yan, Zhichao; Wang, Jiapu; Chen, Jiaoyan; Li, Xiaoli; Li, Ru; Pan, Jeff Z.

arXiv 2024;():

2024

Ref ID: 8741

Attributed Question Answering (AQA) aims to provide both a trustworthy answer and a reliable attribution report for a given question. Retrieval is a widely adopted approach, including two general paradigms: Retrieval-Then-Read (RTR) and post-hoc retrieval. Recently, Large Language Models (LLMs) have shown remarkable proficiency, prompting growing interest in AQA among researchers. However, RTR-based AQA often suffers from irrelevant knowledge and rapidly changing information, even when LLMs are adopted, while post-hoc retrieval-based AQA struggles with comprehending long-form answers with complex logic, and precisely identifying the content needing revision and preserving the original intent. To tackle these problems, this paper proposes an Atomic fact decomposition-based Retrieval and Editing (ARE) framework, which decomposes the generated long-form answers into molecular clauses and atomic facts by the instruction-tuned LLMs. Notably, the instruction-tuned LLMs are fine-tuned using a well-constructed dataset, generated from large-scale Knowledge Graphs (KGs). This process involves extracting one-hop neighbors from a given set of entities and transforming the result into coherent long-form text. Subsequently, ARE leverages a search engine to retrieve evidences related to atomic facts, inputting these evidences into an LLM-based verifier to determine whether the facts require expansion for re-retrieval or editing. Furthermore, the edited facts are backtracked into the original answer, with evidence aggregated based on the relationship between molecular clauses and atomic facts. Extensive evaluations demonstrate the superior performance of our proposed method over state-of-the-art methods on several datasets, with an additionally proposed new metric Attr_(p) for evaluating the precision of evidence attribution.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3517 - Yang 2021
Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta Information

Yang, Bowen; Han, Cong; Li, Yu; Zuo, Lei; Yu, Zhou

arXiv 2021;():

2021

Ref ID: 7507

Conversational recommendation systems (CRS) engage with users by inferring user preferences from dialog history, providing accurate recommendations, and generating appropriate responses. Previous CRSs use knowledge graph (KG) based recommendation modules and integrate KG with language models for response generation. Although KG-based approaches prove effective, two issues remain to be solved. First, KG-based approaches ignore the information in the conversational context but only rely on entity relations and bag of words to recommend items. Second, it requires substantial engineering efforts to maintain KGs that model domain-specific relations, thus leading to less flexibility. In this paper, we propose a simple yet effective architecture comprising a pre-trained language model (PLM) and an item metadata encoder. The encoder learns to map item metadata to embeddings that can reflect the semantic information in the dialog context. The PLM then consumes the semantic-aligned item embeddings together with dialog context to generate high-quality recommendations and responses. Instead of modeling entity relations with KGs, our model reduces engineering complexity by directly converting each item to an embedding. Experimental results on the benchmark dataset ReDial show that our model obtains state-of-the-art results on both recommendation and response generation tasks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1231 - Yang 2024
EHAPZero: Ensemble Hierarchical Attribute Prompting Based Zero-Shot Learning for Pest Recognition

Yang, C.; Jin, Q.; Wang, Y.; Zhou, Y.; Lan, D.; Yang, Y.

IEEE Internet Things J. 2024;():

2024

DOI: 10.1109/JIOT.2024.3472079 · Ref ID: 4132

Pest recognition is of great significance for achieving sustainable development in agriculture. Nevertheless, due to the wide variety of pest species, subtle inter-species differences, and significant intra-species variations, existing artificial intelligence and Internet of Things (IoT) technologies can only recognize a small number of known pests effectively. In this paper, we propose a zero-shot learning pest recognition framework based on ensemble hierarchical attribute prompting, termed EHAPZero. EHAPZero can identify pest images collected by IoT devices, and then transmit the recognition results to the IoT platform for terminal display. Specifically, the image recognition function is implemented by an attribute generation module (AGM), a hierarchical prompting module (HPM), and a semantic-visual interaction module (SVIM). AGM utilizes large language models to construct a knowledge graph of pests. It employs both node importance evaluation algorithms and manual methods to perform dual filtering on attribute nodes within the graph. Inspired by human knowledge reasoning, HPM dynamically predicts different hierarchical attributes of input images within the Transformer intermediate blocks. These predicted attributes are subsequently injected into the intermediate layer features of the Transformer as prompts. To achieve semantic disambiguation and knowledge transfer, SVIM employs a visual-guided semantic representation method and a semantic-guided visual representation method to strengthen cross-domain interaction between semantics and vision. Finally, the final prediction score is derived through ensemble of prediction results across different levels. Extensive experiments show that EHAPZero achieves new state-of-the-art results on the real-world pest recognition benchmark. The codes are available at: https://github.com/jinqiwen/EHAPZero. © 2014 IEEE.

Ishan voted
Kwesi voted
Final decision
What was the agreed final decision?

#3845 - Yang 2023
A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises

Yang, Carl; Cui, Hejie; Lu, Jiaying; Wang, Shiyu; Xu, Ran; Ma, Wenjing; Yu, Yue; Yu, Shaojun; Kan, Xuan; Ling, Chen; Fu, Tianfan; Zhao, Liang; Ho, Joyce; Wang, Fei

arXiv 2023;():

2023

Ref ID: 7753

Healthcare knowledge graphs (HKGs) are valuable tools for organizing biomedical concepts and their relationships with interpretable structures. The recent advent of large language models (LLMs) has paved the way for building more comprehensive and accurate HKGs. This, in turn, can improve the reliability of generated content and enable better evaluation of LLMs. However, the challenges of HKGs, such as data heterogeneity and limited coverage, are not fully understood, highlighting the need for detailed reviews. This work provides the first comprehensive review of HKGs. It summarizes the pipeline and key techniques for HKG construction, as well as the common utilization approaches, i.e., model-free and model-based. The existing HKG resources are also organized based on the data types they capture and application domains they cover, along with relevant statistical information (Resource available at https://github.com/lujiaying/Awesome-HealthCare-KnowledgeBase). At the application level, we delve into the successful integration of HKGs across various health domains, ranging from fine-grained basic science research to high-level clinical decision support and public health. Lastly, the paper highlights the opportunities for HKGs in the era of LLMs. This work aims to serve as a valuable resource for understanding the potential and opportunities of HKG in health research.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1572 - Yang 2024
Learning Choice Nuance for Multiple-Choice Commonsense Question Answering

Yang, D.; Deng, W.; Wang, Z.; Wang, K.; Zhuang, Z.; Li, H.

Proceedings of the International Joint Conference on Neural Networks 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/IJCNN60899.2024.10651121 · Ref ID: 4286

Existing models for commonsense question answering (CQA) usually focus on combining pre-trained language models (PLMs) and structured knowledge graphs (KGs) for joint reasoning. However, such approaches encode a QA context (i.e., a pair of the question and a choice) separately from other choices, which is ineffective for explicitly capturing useful subtle differences among the choices and results in incorrect answers in some cases. This paper proposes a novel model LNC (Learning Nuance among Choices) for addressing this problem and thus provides an improved approach to multiple-choice question answering. Specifically, LNC explicitly interacts between the text knowledge corresponding to each choice and the external KG knowledge corresponding to each choice, and removes the commonalities among similar choices, allowing the model to focus on different relevant knowledge based on the choices, thereby distinguishing semantically similar choices. Experimental results on major benchmark datasets show that LNC is competitive compared to the baseline models. © 2024 IEEE.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#1292 - Yang 2023
Expanding the Vocabulary of BERT for Knowledge Base Construction

Yang, D.; Wang, X.; Celebi, R.

CEUR Workshop Proceedings 2023;3577():

CEUR-WS 2023

Ref ID: 5066

Knowledge base construction entails acquiring structured information to create a knowledge base of factual and relational data, facilitating question answering, information retrieval, and semantic understanding. The challenge called "Knowledge Base Construction from Pretrained Language Models" at the International Semantic Web Conference 2023 defines tasks focused on constructing a knowledge base using language models. Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion, and the inclusion of entity descriptions within the prompt is prohibited. Although the masked language model offers sufficient flexibility to extend its vocabulary, it is not inherently designed for multi-token prediction. To address this, we present Vocabulary Expandable BERT for knowledge base construction, which expands the language model's vocabulary while preserving semantic embeddings for newly added words. We adopt task-specific re-pre-training on the masked language model to further enhance the language model. Through experimentation, the results show the effectiveness of our approaches. Our framework achieves an F1 score of 0.323 on the hidden test set and 0.362 on the validation set, both datasets provided by the challenge. Notably, our framework adopts a lightweight language model (BERT-base, 0.13 billion parameters) and surpasses the model using prompts directly on a large language model (Chatgpt-3, 175 billion parameters). Besides, Token-Recode achieves performance comparable to Re-pretrain. This research advances language understanding models by enabling the direct embedding of multi-token entities, signifying a substantial step forward in the link prediction task in knowledge graphs and metadata completion in data management. © 2023 CEUR-WS. All rights reserved.
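
The vocabulary-expansion idea, appending a multi-token entity while preserving the semantics of its pieces, is commonly implemented by initializing the new embedding as the mean of its subword embeddings. A minimal sketch with a hypothetical two-word vocabulary and 4-dimensional embeddings (whether this matches the paper's exact initialization is an assumption):

```python
# Hypothetical existing vocabulary and embedding table.
vocab = {"knowledge": 0, "graph": 1}
emb = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0]]

def add_word(word, subwords, vocab, emb):
    """Append a new word whose vector is the mean of its subword embeddings,
    so the semantics of the pieces carry over to the new entry."""
    rows = [emb[vocab[s]] for s in subwords]
    new_vec = [sum(col) / len(rows) for col in zip(*rows)]
    vocab[word] = len(vocab)
    emb.append(new_vec)

add_word("knowledge_graph", ["knowledge", "graph"], vocab, emb)
```

The model can then predict the multi-token entity as a single vocabulary item instead of requiring multi-token decoding.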

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1907 - Yang 2024
Structure Prompt Augmented Language Model Embedding on Electrical Equipment Defect Knowledge Graph

Yang, H.; Meng, X.; Yu, H.; Bai, Y.; Han, Y.; Liu, Y.

Int. J. High Speed Electron. Syst. 2024;():

2024

DOI: 10.1142/S0129156424400512 · Ref ID: 4454

Knowledge graphs have demonstrated significant impact in the power grid domain, facilitating various applications such as defect diagnosis and grid management. However, their reasoning capabilities have not been fully exploited. In this paper, we explore the utilization of knowledge graphs for power grid defect diagnosis. We construct an electrical equipment defect knowledge graph and predict missing links, which is also known as Knowledge Graph Completion (KGC). However, we notice the long-tail problem in the electrical equipment knowledge graph. To tackle this challenge, we propose a novel text-based model named SPALME (Structure Prompt Augmented Language Model Embedding) that incorporates structural information as prompts. Our model leverages the power of pre-trained language models, allowing it to comprehend the semantic information of entities and relationships in the knowledge graph. Additionally, by integrating structural information as prompts during the learning process, our model gains a deeper understanding of the graph's topological structure efficiently, effectively capturing intricate dependencies between grid equipment. We evaluate our approach on various datasets. The results demonstrate that our model consistently outperforms baseline methods on the majority of the datasets. © 2024 World Scientific Publishing Company.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#611 - Yang 2020
NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model

Yang, H.; Qin, Y.; Deng, Y.; Wang, M. H.; Ieee

22nd IEEE International Conference on Advanced Communication Technology (ICACT) 2020;():185-189

Pyeongchang, SOUTH KOREA Ieee 2020

DOI: 10.23919/icact48636.2020.9061292 · Ref ID: 3065

Pre-trained language models like BERT, RoBERTa, and GPT have achieved SOTA effects on multiple NLP tasks (e.g. sentiment classification, information extraction, event extraction, etc.). We propose a simple method based on a knowledge graph to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates at the same time. Second, we treat different predicates as different fields, and improve the recognition ability of NMT models in different fields through classification labels. Finally, beam search combined with L2R and R2L reranking rearranges results using entities. Based on the CWMT2018 experimental data, using the predicate's domain classification identifier, the BLEU score increased from 33.58% to 37.63%, and through L2R and R2L reranking, the BLEU score increased to 39.25%, an overall improvement of more than 5 percentage points.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#314 - Yang 2024
Give us the Facts: Enhancing Large Language Models With Knowledge Graphs for Fact-Aware Language Modeling

Yang, L. Y.; Chen, H. Y.; Li, Z.; Ding, X.; Wu, X. D.

IEEE Trans. Knowl. Data Eng. 2024;36(7):3091-3110

2024

DOI: 10.1109/tkde.2024.3360454 · Ref ID: 2944

Recently, ChatGPT, a representative large language model (LLM), has gained considerable attention. Due to their powerful emergent abilities, recent LLMs are considered as a possible alternative to structured knowledge bases like knowledge graphs (KGs). However, while LLMs are proficient at learning probabilistic language patterns and engaging in conversations with humans, they, like previous smaller pre-trained language models (PLMs), still have difficulty in recalling facts while generating knowledge-grounded content. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs to incorporate explicit factual knowledge into PLMs, thus improving their performance in generating texts requiring factual knowledge and providing more informed responses to user queries. This paper reviews the studies on enhancing PLMs with KGs, detailing existing knowledge graph enhanced pre-trained language models (KGPLMs) as well as their applications. Inspired by existing studies on KGPLM, this paper proposes enhancing LLMs with KGs by developing knowledge graph-enhanced large language models (KGLLMs). KGLLM provides a solution to enhance LLMs' factual reasoning ability, opening up new avenues for LLM research.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#3947 - Yang 2024
Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment

Yang, Linyao; Chen, Hongyang; Wang, Xiao; Yang, Jing; Wang, Fei-Yue; Liu, Han

arXiv 2024;():

2024

Ref ID: 8055

Entity alignment, which is a prerequisite for creating a more comprehensive Knowledge Graph (KG), involves pinpointing equivalent entities across disparate KGs. Contemporary methods for entity alignment have predominantly utilized knowledge embedding models to procure entity embeddings that encapsulate various similarities: structural, relational, and attributive. These embeddings are then integrated through attention-based information fusion mechanisms. Despite this progress, effectively harnessing multifaceted information remains challenging due to inherent heterogeneity. Moreover, while Large Language Models (LLMs) have exhibited exceptional performance across diverse downstream tasks by implicitly capturing entity semantics, this implicit knowledge has yet to be exploited for entity alignment. In this study, we propose a Large Language Model-enhanced Entity Alignment framework (LLMEA), integrating structural knowledge from KGs with semantic knowledge from LLMs to enhance entity alignment. Specifically, LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across KGs and edit distances to a virtual equivalent entity. It then engages an LLM iteratively, posing multiple multi-choice questions to draw upon the LLM's inference capability. The final prediction of the equivalent entity is derived from the LLM's output. Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models. Additional ablation studies underscore the efficacy of our proposed framework.
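
LLMEA's candidate selection combines embedding similarity with name edit distance. A sketch of that ranking step with an arbitrary penalty weight (the framework's actual scoring function is not specified here):

```python
def edit_distance(a, b):
    """Levenshtein distance via a rolling dynamic-programming row."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = sum(x * x for x in u) ** 0.5
    nv = sum(x * x for x in v) ** 0.5
    return dot / (nu * nv)

def candidates(name, vec, others, top_k=2):
    """Rank entities in the other KG by embedding similarity minus a
    name edit-distance penalty (the 0.1 weight is a made-up constant)."""
    scored = sorted(
        others,
        key=lambda o: -(cosine(vec, o[1]) - 0.1 * edit_distance(name, o[0])),
    )
    return [o[0] for o in scored[:top_k]]
```

The top-k candidates would then be presented to the LLM as multiple-choice options.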

Srividya voted
Kwesi voted
Final decision
What was the agreed final decision?

#1721 - Yang 2024
PEK: A Parameter-Efficient Framework for Knowledge-Grounded Dialogue Generation

Yang, P.; Song, D.; Wu, Z.; Zhou, Y.; Yang, Z.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():9261-9273

Association for Computational Linguistics (ACL) 2024

Ref ID: 4228

Pre-trained language models (PLMs) have shown great dialogue generation capability in different scenarios. However, the huge VRAM consumption when fine-tuning them is one of their drawbacks. Parameter-efficient fine-tuning (PEFT) approaches can significantly reduce the number of trainable parameters, which enables us to fine-tune larger dialogue generation models. However, the reduction in parameter quantity can diminish a PLM's expressive capacity and affect the PLM's learning from certain specific examples like knowledge-related conversations. Previous works have demonstrated that injecting external knowledge into dialogue generation models can improve the model's performance in knowledge-related conversations. Nonetheless, these methods are designed for the scenario where most parameters of the entire framework are trainable. In this paper, we propose PEK, a parameter-efficient framework for knowledge-enhanced dialogue generation. It enables PLMs to leverage external knowledge documents and knowledge graphs to enhance its generation capabilities with an acceptable number of trainable parameters. Evaluation results on the Wizard of Wikipedia and CMU_DoG datasets show that our approach outperforms baseline methods on multiple evaluation metrics, which validates the effectiveness of our approach. © 2024 Association for Computational Linguistics.

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#28 - Yang 2024
Ascle-A Python Natural Language Processing Toolkit for MedicalText Generation:Development and Evaluation Study

Yang, R.; Zeng, Q. C.; You, K.; Qiao, Y. J.; Huang, L. C.; Hsieh, C. C.; Rosand, B.; Goldwasser, J.; Dave, A.; Keenan, T.; Ke, Y. H.; Hong, C.; Liu, N.; Chew, E.; Radev, D.; Lu, Z. Y.; Xu, H.; Chen, Q. Y.; Li, I. R. E.

J. Med. Internet Res. 2024;26():14

2024

DOI: 10.2196/60601 · Ref ID: 3525

Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. To address this, natural language processing (NLP) algorithms have been developed to automate text processing. In the biomedical field, various toolkits for text processing exist, which have greatly improved the efficiency of handling unstructured text. However, these existing toolkits tend to emphasize different perspectives, and none of them offer generation capabilities, leaving a significant gap in the current offerings. Objective: This study aims to describe the development and preliminary evaluation of Ascle. Ascle is tailored for biomedical researchers and clinical staff with an easy-to-use, all-in-one solution that requires minimal programming expertise. For the first time, Ascle provides 4 advanced and challenging generative functions: question-answering, text summarization, text simplification, and machine translation. In addition, Ascle integrates 12 essential NLP functions, along with query and search capabilities for clinical databases. Methods: We fine-tuned 32 domain-specific language models and evaluated them thoroughly on 27 established benchmarks. In addition, for the question-answering task, we developed a retrieval-augmented generation (RAG) framework for large language models that incorporated a medical knowledge graph with ranking techniques to enhance the reliability of generated answers. Additionally, we conducted a physician validation to assess the quality of generated content beyond automated metrics. Results: The fine-tuned models and RAG framework consistently enhanced text generation tasks. For example, the fine-tuned models improved the machine translation task by 20.27 in terms of BLEU score. In the question-answering task, the RAG framework raised the ROUGE-L score by 18% over the vanilla models. Physician validation of generated answers showed high scores for readability (4.95/5) and relevancy (4.43/5), with lower scores for accuracy (3.90/5) and completeness (3.31/5). Conclusions: This study introduces the development and evaluation of Ascle, a user-friendly NLP toolkit designed for medical text generation. All code is publicly available through the Ascle GitHub repository. All fine-tuned language models can be accessed through Hugging Face.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3487 - Yang 2024
Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education

Yang, Rui; Yang, Boming; Ouyang, Sixun; She, Tianwei; Feng, Aosong; Jiang, Yuang; Lecue, Freddy; Lu, Jinghui; Li, Irene

arXiv 2024;():

2024

Ref ID: 8461

Knowledge graphs (KGs) are crucial in the field of artificial intelligence and are widely applied in downstream tasks, such as enhancing Question Answering (QA) systems. The construction of KGs typically requires significant effort from domain experts. Recently, Large Language Models (LLMs) have been used for knowledge graph construction (KGC), however, most existing approaches focus on a local perspective, extracting knowledge triplets from individual sentences or documents. In this work, we introduce Graphusion, a zero-shot KGC framework from free text. The core fusion module provides a global view of triplets, incorporating entity merging, conflict resolution, and novel triplet discovery. We showcase how Graphusion could be applied to the natural language processing (NLP) domain and validate it in the educational scenario. Specifically, we introduce TutorQA, a new expert-verified benchmark for graph reasoning and QA, comprising six tasks and a total of 1,200 QA pairs. Our evaluation demonstrates that Graphusion surpasses supervised baselines by up to 10% in accuracy on link prediction. Additionally, it achieves average scores of 2.92 and 2.37 out of 3 in human evaluations for concept entity extraction and relation recognition, respectively.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#766 - Yang 2024
Sequential Recommendation with Latent Relations based on Large Language Model

Yang, S. H.; Ma, W. Z.; Sun, P. J.; Ai, Q. Y.; Liu, Y. Q.; Cai, M. C.; Zhang, M.; Assoc Computing, Machinery

47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():335-344

Washington, DC Assoc Computing Machinery 2024

DOI: 10.1145/3626772.3657762 · Ref ID: 3375

Sequential recommender systems predict items that may interest users by modeling their preferences based on historical interactions. Traditional sequential recommendation methods rely on capturing implicit collaborative filtering signals among items. Recent relation-aware sequential recommendation models have achieved promising performance by explicitly incorporating item relations into the modeling of user historical sequences, where most relations are extracted from knowledge graphs. However, existing methods rely on manually predefined relations and suffer from the sparsity issue, limiting the generalization ability in diverse scenarios with varied item relations. In this paper, we propose a novel relation-aware sequential recommendation framework with Latent Relation Discovery (LRD). Different from previous relation-aware models that rely on predefined rules, we propose to leverage the Large Language Model (LLM) to provide new types of relations and connections between items. The motivation is that the LLM contains abundant world knowledge, which can be adopted to mine latent relations of items for recommendation. Specifically, inspired by the fact that humans can describe relations between items using natural language, LRD harnesses the LLM, which has demonstrated human-like knowledge, to obtain language knowledge representations of items. These representations are fed into a latent relation discovery module based on the discrete state variational autoencoder (DVAE). Then the self-supervised relation discovery tasks and recommendation tasks are jointly optimized. Experimental results on multiple public datasets demonstrate that our proposed latent relation discovery method can be incorporated with existing relation-aware sequential recommendation models and significantly improve the performance. Further analysis experiments indicate the effectiveness and reliability of the discovered latent relations.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#22 - Yang 2022
Approximate inferring with confidence predicting based on uncertain knowledge graph embedding

Yang, S. H.; Zhang, W. Y.; Tang, R.; Zhang, M. K.; Huang, Z. S.

Inf. Sci. 2022;609():679-690

2022

DOI: 10.1016/j.ins.2022.07.098 · Ref ID: 3031

Uncertainty is a natural character of knowledge, yet it is still difficult to encode into the knowledge graph embedding space employed for machine learning tasks. However, approximate inference can be performed in the embedding space if confidence, a real-valued representation of the uncertainty of knowledge facts, can be learned by neural networks. To tackle this, a simple yet effective confidence predicting method is proposed, and several approximate inferences are efficiently performed based on these predictions. The model is a two-step model: a knowledge elements embedding step, in which knowledge facts regarded as short sentences are fed into the natural language model to get entity and relation embedding vectors; and a confidence learning step, in which the confidence distribution of knowledge facts in the knowledge graph is learned using a recurrent neural network in order to carry out approximate inference. The experiments demonstrate that the model achieves better results than the state of the art on the link prediction task over uncertain knowledge graph embedding. Uncertainty inferring grounded on predicted confidence is more accurate, feasible, and meaningful for several knowledge inferring tasks: transitivity, composition inferring, and probabilistic soft logic inferring. Likewise, the proposed approach achieves the best tradeoff between efficiency and accuracy of uncertain knowledge graph embedding and inferring, and can be used to handle large knowledge graphs at lower time consumption because of its simplicity. © 2022 Elsevier Inc. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1723 - Yang 2020
Person-relation extraction using bert based knowledge graph

Yang, S. M.; Yoo, S. Y.; Ahn, Y. S.; Jeong, O. R.

ICIC Express Lett Part B Appl. 2020;11(6):539-544

2020

DOI: 10.24507/icicelb.11.06.539 · Ref ID: 5797

Artificial intelligence technology has been actively researched in the areas of image processing and natural language processing. Recently, with the release of Google’s language model BERT, the importance of artificial intelligence models has attracted attention in the field of natural language processing. In this paper, we propose a knowledge graph to build a model that can extract people in a document using BERT, and to grasp the relationship between people based on the model. In addition, to verify the applicability of person extraction techniques using BERT based knowledge graphs, we conduct a performance comparison experiment with other person extraction models and apply our proposed method to the case study. © 2020, ICIC International.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1577 - Yang 2021
Learning Knowledge Uncertainty from the Pretrained Language Model

Yang, S.; Tang, R.

ACM International Conference Proceeding Series 2021;():37-42

Association for Computing Machinery 2021

DOI: 10.1145/3503928.3503936 · Ref ID: 5588

Uncertain knowledge graphs, with each fact assigned a confidence value between 0 and 1, are a kind of graph-structured knowledge base. Knowledge representation is the foundation for most knowledge-driven applications, in which knowledge is encoded into a continuous vector space for rapid computation. Unfortunately, it is still a big challenge to encode meaningful knowledge features into the embedding space, such as uncertainty, inferring structures, and commonsense knowledge. An uncertain knowledge graph embedding model, UKGEbert, is proposed to embed latent commonsense semantics via the pretrained natural language model. In the model, each knowledge fact is treated as a short sentence, which is fed into BERT for training. After that, the model learns the uncertainty distribution of knowledge confidence with a recurrent neural network. Experiments on several benchmark datasets show that an effective prediction of confidence can help enhance the ability of knowledge inferring in the embedding space. Furthermore, the model achieves the state of the art in several main metrics on the link prediction task of uncertain knowledge graphs. © 2021 Association for Computing Machinery. All rights reserved.

Srividya voted
Mike voted
Final decision
What was the agreed final decision?

#2653 - Yang 2010
Mapping Relational Databases into Ontologies through a Graph-based Formal Model

Yang, S.; Wu, J.

2010 Sixth International Conference on Semantics, Knowledge and Grids 2010;():219-226

2010

DOI: 10.1109/SKG.2010.33 · Ref ID: 6432

One of the key issues of Semantic Web applications is the lack of semantic data (ontologies). Although the vast majority of data are stored in popular relational databases, they are still not easily available to many next-generation Web applications. Therefore, one of the core challenges of the Semantic Web is whether these applications can automatically retrieve semantic information from existing relational databases. This paper proposes a middle graph-based formal model language, W-graph, as a bridge between relational databases and ontologies, which abstracts semantic information from relational database instances semi-automatically and then generates an OWL ontology automatically. This method not only maps relational database schemata to ontologies, but also populates ontologies with data stored in databases. Moreover, a proof of semantic preservation of the mapping is provided, and a case study and an implemented prototype tool are also reported.

Mike voted
Ishan voted
Final decision
What was the agreed final decision?

#1149 - Yang 2020
Creative storytelling with language models and knowledge graphs

Yang, X.; Tiddi, I.

CEUR Workshop Proceedings 2020;2699():

CEUR-WS 2020

Ref ID: 5746

Automated story generation is a popular and well-recognized task in the field of natural language processing. The emergence of pre-trained language models based on large Transformer architectures shows the great capability of text generation. However, language models are limited when the generation requires explicit clues within the context. In this research, we study how to combine knowledge graphs with language models, and build a creative story generation system named DICE. DICE uses external knowledge graphs to provide context clues and implicit knowledge to generate coherent and creative stories. The evaluation shows that our approach can effectively inject the knowledge from knowledge graphs into the stories automatically generated by the language model. © 2020 CEUR-WS. All rights reserved.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3304 - Yang 2024
CRAG – Comprehensive RAG Benchmark

Yang, Xiao; Sun, Kai; Xin, Hao; Sun, Yushi; Bhalla, Nikita; Chen, Xiangsen; Choudhary, Sajal; Gui, Rongze Daniel; Jiang, Ziran Will; Jiang, Ziyu; Kong, Lingkun; Moran, Brian; Wang, Jiaqi; Xu, Yifan Ethan; Yan, An; Yang, Chenyu; Yuan, Eting; Zha, Hanwen; Tang, Nan; Chen, Lei; Scheffer, Nicolas; Liu, Yue; Shah, Nirav; Wanga, Rakesh; Kumar, Anuj; Yih, Wen-tau; Dong, Xin Luna

arXiv 2024;():

2024

Ref ID: 8363

Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Model (LLM)'s deficiency in lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamisms ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA. Whereas most advanced LLMs achieve ≤34% accuracy on CRAG, adding RAG in a straightforward manner improves the accuracy only to 44%. State-of-the-art industry RAG solutions only answer 63% questions without any hallucination. CRAG also reveals much lower accuracy in answering questions regarding facts with higher dynamism, lower popularity, or higher complexity, suggesting future research directions. The CRAG benchmark laid the groundwork for a KDD Cup 2024 challenge, attracting thousands of participants and submissions within the first 50 days of the competition. We commit to maintaining CRAG to serve research communities in advancing RAG solutions and general QA solutions.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#2006 - Yang 2024
UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing

Yang, Y.; He, J.; Chen, P.; Gutiérrez-Basulto, V.; Pan, J. Z.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():7011-7028

Association for Computational Linguistics (ACL) 2024

Ref ID: 4423

Several recent papers have investigated the potential of language models as knowledge bases as well as the existence of severe biases when extracting factual knowledge. In this work, we focus on the factual probing performance over unseen prompts from tuning, and using a probabilistic view we show the inherent misalignment between pre-training and downstream tuning objectives in language models for probing knowledge. We hypothesize that simultaneously debiasing these objectives can be the key to generalisation over unseen prompts. We propose an adapter-based framework, UniArk, for generalised and consistent factual knowledge extraction through simple methods without introducing extra parameters. Extensive experiments show that UniArk can significantly improve the model’s out-of-domain generalisation as well as consistency under various prompts. Additionally, we construct ParaTrex, a large-scale and diverse dataset for measuring the inconsistency and out-of-domain generalisation of models. Further, ParaTrex offers a reference method for constructing paraphrased datasets using large language models. © 2024 Association for Computational Linguistics.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#16 - Yang 2023
API comparison knowledge extraction via prompt-tuned language model

Yang, Y. R.; Zhu, Y. P.; Chen, S. S.; Jian, P. P.

J. Comput. Lang. 2023;75():8

2023

DOI: 10.1016/j.cola.2023.101200 · Ref ID: 3644

Application Programming Interfaces (APIs) are frequent in software engineering domain texts, such as API references and Stack Overflow. These APIs and the comparison knowledge between them are not only important for solving programming issues (e.g., question answering), but they are also organized into structured knowledge to support many software engineering tasks (e.g., API misuse detection). As a result, extracting API comparison knowledge (API entities and semantic relations) from texts is essential. Existing rule-based and sequence labeling-based approaches must manually enumerate all linguistic patterns or label a large amount of data. Therefore, they involve a significant labor overhead and are exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, we formulate heterogeneous API extraction and API relation extraction as a sequence-to-sequence generation task and propose APICKnow, an API entity-relation joint extraction model based on a large language model. To improve our model's performance and quick learning ability, we adopt the prompt learning method to stimulate APICKnow to recognize API entities and relations. We systematically evaluate APICKnow on a set of sentences from Stack Overflow. The experimental results show that APICKnow can outperform the state-of-the-art baselines, and APICKnow has a quick learning ability and strong generalization ability.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1434 - Yang 2024
An Intra-Network Multi-Teacher Distillation Method Towards Lightweight Knowledge Graph Completion

Yang, Z.; Duan, Y.; Xue, J.; Qi, Q.

2024 IEEE 9th International Conference on Computational Intelligence and Applications, ICCIA 2024 2024;():109-114

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/ICCIA62557.2024.10719142 · Ref ID: 4182

Recently, Knowledge Graph Completion (KGC) based on Pre-trained Language Models (PLM) has made significant advancements. However, PLMs typically have a large number of parameters, which makes lightweight research for low-resource settings challenging. For KGC, knowledge distillation can be a portable method, but traditional knowledge distillation struggles to achieve efficient knowledge transfer. To solve this issue, this paper proposes an intra-network multi-teacher knowledge distillation, which can effectively reduce knowledge leakage through multi-level information transmission. Specifically, we divide the teacher model into multiple sub-teachers based on network depth, and the sub-teachers deliver different knowledge representations. In addition, we use the loss variation of each sub-teacher as a confidence level, which can dynamically regulate the intensity of multi-teacher distillation and enable the student model to perceive distilled knowledge at a finer granularity. A series of experimental results show that our proposed method achieves state-of-the-art performance with a low number of parameters. © 2024 IEEE.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1146 - Yang 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text

Yang, Z.; Ishay, A.; Lee, J.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():5186-5219

Association for Computational Linguistics (ACL) 2023

Ref ID: 5272

While large language models (LLMs), such as GPT-3, appear to be robust and general, their reasoning ability is not at a level to compete with the best models trained for specific natural language reasoning problems. In this study, we observe that a large language model can serve as a highly effective few-shot semantic parser. It can convert natural language sentences into a logical form that serves as input for answer set programs, a logic-based declarative knowledge representation formalism. The combination results in a robust and general system that can handle multiple question-answering tasks without requiring retraining for each new task. It only needs a few examples to guide the LLM's adaptation to a specific task, along with reusable ASP knowledge modules that can be applied to multiple tasks. We demonstrate that this method achieves state-of-the-art performance on several NLP benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN. Additionally, it successfully tackles robot planning tasks that an LLM alone fails to solve. © 2023 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#2376 - Yang 2023
EMoDi: Entity-Enhanced Momentum-Difference Contrastive Learning for Semantic-Aware Verification of Scientific Information

Yang, Z.; Sun, Y.; Nakaguchi, T.; Imai, M.

2023 IEEE International Conference on Knowledge Graph (ICKG) 2023;():142-151

2023

DOI: 10.1109/ICKG59574.2023.00023 · Ref ID: 6314

This paper proposes the EMoDi system to improve the performance of the entire scientific information verification pipeline. First, the Momentum-Difference contrastive learning framework is introduced to capture more semantics information. In abstract retrieval, entity-enhancement and noise-ignoration are introduced to improve the ability to retrieve relevant abstracts more accurately. In addition, a two-step verification method is used in label prediction to improve the label prediction ability and reduce the false positive rate of the “NOT ENOUGH INFO” label. The proposed pipeline outperforms the baseline VERISCI and QMUL-SDS. The code of this system is available on GitHub.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3310 - Yang 2024
CuriousLLM: Elevating Multi-Document QA with Reasoning-Infused Knowledge Graph Prompting

Yang, Zukang; Zhu, Zixuan

arXiv 2024;():

2024

Ref ID: 8234

In the field of Question Answering (QA), unifying large language models (LLMs) with external databases has shown great success. However, these methods often fall short in providing the advanced reasoning needed for complex QA tasks. To address these issues, we improve on a novel approach called Knowledge Graph Prompting (KGP), which combines knowledge graphs with an LLM-based agent to improve reasoning and search accuracy. Nevertheless, the original KGP framework necessitates costly fine-tuning with large datasets yet still suffers from LLM hallucination. Therefore, we propose a reasoning-infused LLM agent to enhance this framework. This agent mimics human curiosity to ask follow-up questions to more efficiently navigate the search. This simple modification significantly boosts the LLM performance in QA tasks without the high costs and latency associated with the initial KGP framework. Our ultimate goal is to further develop this approach, leading to more accurate, faster, and cost-effective solutions in the QA domain.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3568 - Yao 2019
KG-BERT: BERT for Knowledge Graph Completion

Yao, Liang; Mao, Chengsheng; Luo, Yuan

arXiv 2019;():

2019

Ref ID: 7374

Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes entity and relation descriptions of a triple as input and computes scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction and relation prediction tasks.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#11 - Yao 2024
AgCNER, the First Large-Scale Chinese Named Entity Recognition Dataset for Agricultural Diseases and Pests

Yao, X. C.; Hao, X.; Liu, R. L.; Li, L.; Guo, X. C.

Sci. Data 2024;11(1):14

2024

DOI: 10.1038/s41597-024-03578-5 · Ref ID: 3338

Named entity recognition is a fundamental subtask for knowledge graph construction and question-answering in the agricultural diseases and pests field. Although several works have been done, the scarcity of the Chinese annotated dataset has restricted the development of agricultural diseases and pests named entity recognition (ADP-NER). To address the issues, a large-scale corpus for the Chinese ADP-NER task named AgCNER was first annotated. It mainly contains 13 categories, 206,992 entities, and 66,553 samples with 3,909,293 characters. Compared with other datasets, AgCNER maintains the best performance in terms of the number of categories, entities, samples, and characters. Moreover, this is the first publicly available corpus for the agricultural field. In addition, the agricultural language model AgBERT is also fine-tuned and released. Finally, the comprehensive experimental results showed that BiLSTM-CRF achieved F1-score of 93.58%, which would be further improved to 94.14% using BERT. The analysis from multiple aspects has verified the rationality of AgCNER and the effectiveness of AgBERT. The annotated corpus and fine-tuned language model are publicly available at https://doi.org/XXX and https://github.com/guojson/AgCNER.git.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2593 - Yao 2024
Internal-External Information Enhanced Causal Reasoning

Yao, Y.; Yang, F.; Wang, K.; Zhou, X.

2024 International Joint Conference on Neural Networks (IJCNN) 2024;():1-8

2024

DOI: 10.1109/IJCNN60899.2024.10651415 · Ref ID: 6038

Causal reasoning is vitally important for various natural language processing tasks, which require semantic understanding of text and a rich reserve of knowledge. Causal question-answering (CQA), one of the causal reasoning tasks, aims to choose either the cause or effect of a given story sentence. It requires both background causal knowledge and the ability to infer cause-effect relations. However, existing studies ignore the logical and commonsense relationship between the contexts, which limits the model capability. In this paper, we propose a novel model of Semantic Internal-External Enhancement (SIEE) that enhances both internal and external knowledge. The model employs Abstract Meaning Representation (AMR) to capture the core semantic information and explicit structures. In addition, we explore the commonsense knowledge behind the key information in the context to provide more clues for reasoning. Finally, we combine the above internal and external information by using a semantic aggregator to aggregate the semantic information of neighbors on the keyword nodes. Experimental studies show the competitive performance of our proposed model over the state-of-the-art published results on three CQA benchmarks, e-CARE, COPA and BCOPA.

Srividya voted
Davis voted
Final decision
What was the agreed final decision?

#742 - Yao 2023
Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction

Yao, Y. Z.; Mao, S. Y.; Zhang, N. Y.; Chen, X.; Deng, S. M.; Chen, H. J.; Acm

46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2023;():911-921

Taipei, TAIWAN Assoc Computing Machinery 2023

DOI: 10.1145/3539618.3591763 · Ref ID: 2952

With the development of pre-trained language models, many prompt-based approaches to data-efficient knowledge graph construction have achieved impressive performance. However, existing prompt-based learning methods for knowledge graph construction are still susceptible to several potential limitations: (i) a semantic gap between natural language and output structured knowledge with a pre-defined schema, which means the model cannot fully exploit semantic knowledge with the constrained templates; (ii) representation learning with locally individual instances limits the performance given the insufficient features, which are unable to unleash the potential analogical capability of pre-trained language models. Motivated by these observations, we propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP), for data-efficient knowledge graph construction. It can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample, which is model-agnostic and can be plugged into widespread existing approaches. Experimental results demonstrate that previous methods integrated with RAP can achieve impressive performance gains in low-resource settings on five datasets of relational triple extraction and event extraction for knowledge graph construction.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2066 - Yao 2022
Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context

Yao, Z.; Cao, Y.; Yang, Z.; Deshpande, V.; Yu, H.

AMIA Annu Symp Proc 2022;2022():1188-1197

2022

Ref ID: 5866

Language Models (LMs) have performed well on biomedical natural language processing applications. In this study, we conducted experiments using prompt methods to extract knowledge from LMs as new knowledge bases (LMs as KBs). However, prompting can only serve as a lower bound for knowledge extraction, and performs particularly poorly on biomedical domain KBs. In order to make LMs as KBs more in line with the actual application scenarios of the biomedical domain, we specifically add EHR notes as context to the prompt to improve the lower bound in the biomedical domain. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge possessed by those language models can distinguish the correct knowledge from the noise knowledge in the EHR notes, and such distinguishing ability can also be used as a new metric to evaluate the amount of knowledge possessed by the model.

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#3861 - Yao 2024
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

Yao, Zijun; Qi, Weijian; Pan, Liangming; Cao, Shulin; Hu, Linmei; Liu, Weichuan; Hou, Lei; Li, Juanzi

arXiv 2024;():

2024

Ref ID: 8430

This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states. SeaKR activates retrieval when the LLMs present high self-aware uncertainty for generation. To effectively integrate retrieved knowledge snippets, SeaKR re-ranks them based on LLM's self-aware uncertainty to preserve the snippet that reduces their uncertainty to the utmost. To facilitate solving complex tasks that require multiple retrievals, SeaKR utilizes their self-aware uncertainty to choose among different reasoning strategies. Our experiments on both complex and simple Question Answering datasets show that SeaKR outperforms existing adaptive RAG methods. We release our code at https://github.com/THU-KEG/SeaKR.

Srividya voted
yuexi voted
Final decision
What was the agreed final decision?

#159 - Yasunaga 2022
Deep Bidirectional Language-Knowledge Graph Pretraining

Yasunaga, M.; Bosselut, A.; Ren, H. Y.; Zhang, X. K.; Manning, C. D.; Liang, P.; Leskovec, J.

36th Conference on Neural Information Processing Systems (NeurIPS) 2022;():

Electr Network Neural Information Processing Systems (Nips) 2022

Ref ID: 3034

Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised method to pretrain a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction. DRAGON outperforms existing LM and LM+KG models on diverse downstream tasks including question answering across general and biomedical domains, with +5% absolute gain on average. In particular, DRAGON achieves strong performance on complex reasoning about language and knowledge (+10% on questions involving long contexts or multi-step reasoning) and low-resource QA (+8% on OBQA and RiddleSense), and new state-of-the-art results on various BioNLP tasks. Our code and trained models are available at https://github.com/michiyasunaga/dragon.

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3221 - Ye 2023
Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction

Ye, Hongbin; Gui, Honghao; Zhang, Aijia; Liu, Tong; Hua, Wei; Jia, Weiqiang

arXiv 2023;():

2023

Ref ID: 7972

Knowledge graph construction (KGC) is a multifaceted undertaking involving the extraction of entities, relations, and events. Traditionally, large language models (LLMs) have been viewed as solitary task-solving agents in this complex landscape. However, this paper challenges this paradigm by introducing a novel framework, CooperKGC. Departing from the conventional approach, CooperKGC establishes a collaborative processing network, assembling a KGC collaboration team capable of concurrently addressing entity, relation, and event extraction tasks. Our experiments unequivocally demonstrate that fostering collaboration and information interaction among diverse agents within CooperKGC yields superior results compared to individual cognitive processes operating in isolation. Importantly, our findings reveal that the collaboration facilitated by CooperKGC enhances knowledge selection, correction, and aggregation capabilities across multiple rounds of interactions.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3794 - Ye 2023
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model

Ye, Qichen; Liu, Junling; Chong, Dading; Zhou, Peilin; Hua, Yining; Liu, Fenglin; Cao, Meng; Wang, Ziming; Cheng, Xuxin; Lei, Zhu; Guo, Zhenhua

arXiv 2023;():

2023

Ref ID: 7894

Integrating large language models (LLMs) into healthcare holds great potential but faces challenges. Pre-training LLMs from scratch for domains like medicine is resource-heavy and often unfeasible. On the other hand, sole reliance on Supervised Fine-tuning (SFT) can result in overconfident predictions and may not tap into domain-specific insights. In response, we present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), SFT, and Direct Preference Optimization (DPO). In addition, we publish a 3GB Chinese Medicine (ChiMed) dataset, encompassing medical question answering, plain texts, knowledge graphs, and dialogues, segmented into three training stages. The medical LLM trained with our pipeline, Qilin-Med, shows substantial performance improvement. In the CPT and SFT phases, Qilin-Med achieved 38.4% and 40.0% accuracy on the CMExam test set, respectively. It outperformed the base model Baichuan-7B (accuracy: 33.5%) by 7.5%. In the DPO phase, it scored 16.66 in BLEU-1 and 27.44 in ROUGE-1 on the Huatuo-26M test set, a further improvement over the SFT phase (12.69 in BLEU-1 and 24.21 in ROUGE-1). Additionally, we have further enhanced the model's performance through the Retrieval Augmented Generation (RAG) approach. Experiments demonstrate that Qilin-Med-RAG achieves an accuracy rate of 42.8% on CMExam. These results highlight the contribution of our novel training approach in building LLMs for medical applications.
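For reference, the ROUGE-1 score reported above is a unigram-overlap measure between a generated answer and a reference. A minimal sketch follows; whitespace tokenization is an assumption here, and published scores use standard toolkits rather than this simplified version.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """ROUGE-1: unigram overlap, returned as (precision, recall, F1)."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())          # clipped unigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1
```

BLEU-1 uses the same clipped unigram precision, combined with a brevity penalty instead of recall.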

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#2285 - Ye 2024
Correcting Factual Errors in LLMs via Inference Paths Based on Knowledge Graph

Ye, W.; Zhang, Q.; Zhou, X.; Hu, W.; Tian, C.; Cheng, J.

2024 International Conference on Computational Linguistics and Natural Language Processing (CLNLP) 2024;():12-16

2024

DOI: 10.1109/CLNLP64123.2024.00011 · Ref ID: 7013

Large language models (LLMs) have been observed to occasionally exhibit hallucination, a phenomenon where they generate statements unsupported by factual evidence, thereby compromising the trustworthiness of their output. Current approaches to mitigating this problem largely rely on extracting a single triplet from a knowledge graph, which fails to adequately capture the complex and interlinked nature of factual reasoning. In an effort to address this critical challenge, this paper delves into the utilization of inference paths based on knowledge graphs for factual error correction of LLMs. At the heart of our approach lies the deployment of deep reinforcement learning algorithms, which traverse the knowledge graph to retrieve inference paths. These paths, replete with contextual depth and logical coherence, ground the model's output, amending the content and diminishing the incidence of factual discrepancies in the reasoning process of LLMs. Experimental results demonstrate that our approach markedly enhances the factual QA performance of LLMs. Furthermore, it shows great potential in improving the reliability of LLMs in complex reasoning scenarios, highlighting the effectiveness of inference paths derived from knowledge graphs.

yuexi voted
Davis voted
Final decision
What was the agreed final decision?

#3283 - Ye 2024
Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model

Ye, Yanpeng; Ren, Jie; Wang, Shaozhou; Wan, Yuwei; Wang, Haofen; Razzak, Imran; Hoex, Bram; Xie, Tong; Zhang, Wenjie

arXiv 2024;():

2024

Ref ID: 8216

Knowledge in materials science is widely dispersed across extensive scientific literature, posing significant challenges for efficient discovery and integration of new materials. Traditional methods, often reliant on costly and time-consuming experimental approaches, further complicate rapid innovation. Addressing these challenges, the integration of artificial intelligence with materials science has opened avenues for accelerating the discovery process, though it also demands precise annotation, data extraction, and traceability of information. To tackle these issues, this article introduces the Materials Knowledge Graph (MKG), which utilizes advanced natural language processing techniques, integrated with large language models, to extract and systematically organize a decade's worth of high-quality research into structured triples, comprising 162,605 nodes and 731,772 edges. MKG categorizes information into comprehensive labels such as Name, Formula, and Application, structured around a meticulously designed ontology, thus enhancing data usability and integration. By implementing network-based algorithms, MKG not only facilitates efficient link prediction but also significantly reduces reliance on traditional experimental methods. This structured approach not only streamlines materials research but also lays the groundwork for more sophisticated science knowledge graphs.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#981 - Yin 2023
ALCUNA: Large Language Models Meet New Knowledge

Yin, X.; Huang, B.; Wan, X.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():1397-1414

Association for Computational Linguistics (ACL) 2023

DOI: 10.18653/v1/2023.emnlp-main.87 · Ref ID: 5081

With the rapid development of NLP, large-scale language models (LLMs) now excel in various tasks across multiple domains. However, existing benchmarks may not adequately measure these models' capabilities, especially when faced with new knowledge. In this paper, we address the lack of benchmarks to evaluate LLMs' ability to handle new knowledge, an important and challenging aspect in the rapidly evolving world. We propose an approach called KnowGen that generates new knowledge by altering existing entity attributes and relationships, resulting in artificial entities that are distinct from real-world entities. With KnowGen, we introduce a benchmark named ALCUNA to assess LLMs' abilities in knowledge understanding, differentiation, and association. We benchmark several LLMs, revealing that their performance in the face of new knowledge is not satisfactory, particularly in reasoning between new and internal knowledge. We also explore the impact of entity similarity on the model's understanding of entity knowledge and the influence of contextual entities. We appeal to the need for caution when using LLMs in new scenarios or with new knowledge, and hope that our benchmarks can help drive the development of LLMs in the face of new knowledge. ©2023 Association for Computational Linguistics.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1017 - Yin 2024
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation

Yin, X.; Zhang, X.; Ruan, J.; Wan, X.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():2270-2286

Association for Computational Linguistics (ACL) 2024

Ref ID: 4306

In recent years, substantial advancements have been made in the development of large language models, achieving remarkable performance across diverse tasks. To evaluate the knowledge ability of language models, previous studies have proposed numerous benchmarks based on question-answering pairs. We argue that it is neither reliable nor comprehensive to evaluate language models with a fixed question or limited paraphrases as the query, since language models are sensitive to prompts. Therefore, we introduce a novel concept named knowledge boundary to encompass both prompt-agnostic and prompt-sensitive knowledge within language models. Knowledge boundary avoids prompt sensitivity in language model evaluations, rendering them more dependable and robust. To explore the knowledge boundary for a given model, we propose a projected gradient descent method with semantic constraints, a new algorithm designed to identify the optimal prompt for each piece of knowledge. Experiments demonstrate a superior performance of our algorithm in computing the knowledge boundary compared to existing methods. Furthermore, we evaluate the ability of multiple language models in several domains with knowledge boundary. © 2024 Association for Computational Linguistics.

Xinchen voted
Ishan voted
Final decision
What was the agreed final decision?

#1479 - Youn 2023
KGLM: Integrating Knowledge Graph Structure in Language Models for Link Prediction

Youn, J.; Tagkopoulos, I.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():217-224

Association for Computational Linguistics (ACL) 2023

DOI: 10.18653/v1/2023.starsem-1.20 · Ref ID: 5114

The ability of knowledge graphs to represent complex relationships at scale has led to their adoption for various needs including knowledge representation, question answering, and recommendation systems. Knowledge graphs are often incomplete in the information they represent, necessitating knowledge graph completion tasks. Pre-trained and fine-tuned language models have shown promise in these tasks, although these models ignore the intrinsic information encoded in the knowledge graph, namely the entity and relation types. In this work, we propose the Knowledge Graph Language Model (KGLM) architecture, where we introduce a new entity/relation embedding layer that learns to differentiate distinctive entity and relation types, therefore allowing the model to learn the structure of the knowledge graph. In this work, we show that further pretraining the language models with this additional embedding layer using the triples extracted from the knowledge graph, followed by the standard fine-tuning phase, sets a new state-of-the-art performance for the link prediction task on the benchmark datasets. © 2023 Association for Computational Linguistics.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#3375 - Youssef 2024
Enhancing Fact Retrieval in PLMs through Truthfulness

Youssef, Paul; Schlötterer, Jörg; Seifert, Christin

arXiv 2024;():

2024

Ref ID: 8722

Pre-trained Language Models (PLMs) encode various facts about the world at their pre-training phase, as they are trained to predict the next or missing word in a sentence. There has been interest in quantifying and improving the amount of facts that can be extracted from PLMs, as they have been envisioned to act as soft knowledge bases, which can be queried in natural language. Different approaches exist to enhance fact retrieval from PLMs. Recent work shows that the hidden states of PLMs can be leveraged to determine the truthfulness of the PLMs' inputs. Leveraging this finding to improve factual knowledge retrieval remains unexplored. In this work, we investigate the use of a helper model to improve fact retrieval. The helper model assesses the truthfulness of an input based on the corresponding hidden-state representations from the PLMs. We evaluate this approach on several masked PLMs and show that it enhances fact retrieval by up to 33%. Our findings highlight the potential of hidden-state representations from PLMs in improving their factual knowledge retrieval.

yuexi voted
Davis voted
Final decision
What was the agreed final decision?

#1251 - Yu 2024
Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration

Yu, H. C.; Shih, Y. A.; Law, K. M.; Hsieh, K. Y.; Cheng, Y. C.; Ho, H. C.; Lin, Z. A.; Hsu, W. C.; Fan, Y. C.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():11019-11029

Association for Computational Linguistics (ACL) 2024

Ref ID: 4241

In this paper, we tackle the task of distractor generation (DG) for multiple-choice questions. Our study introduces two key designs. First, we propose retrieval augmented pretraining, which involves refining the language model pretraining to align it more closely with the downstream task of DG. Second, we explore the integration of knowledge graphs to enhance the performance of DG. Through experiments with benchmarking datasets, we show that our models significantly outperform the state-of-the-art results. Our best-performing model advances the F1@3 score from 14.80 to 16.47 on the MCQ dataset and from 15.92 to 16.50 on the Sciq dataset. © 2024 Association for Computational Linguistics.
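The F1@3 metric reported above scores the top-3 generated distractors against the gold distractor set. A minimal sketch of F1@k under that reading (the exact matching procedure used in the paper is an assumption here):

```python
def f1_at_k(predicted, gold, k=3):
    """F1@k: compare the top-k predicted distractors against the gold set."""
    hits = len(set(predicted[:k]) & set(gold))
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)
```

With three gold distractors, one hit in the top 3 yields precision = recall = 1/3, so F1@3 = 1/3; a perfect top-3 yields 1.0.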

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2585 - Yu 2007
Intelligent Software Agent Design Tool Using Goal Net Methodology

Yu, H.; Shen, Z.; Miao, C.

2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'07) 2007;():43-46

2007

DOI: 10.1109/IAT.2007.25 · Ref ID: 7077

Intelligent agents are a fast-emerging technology with a wide range of applications. Although there are several tools for agent development, few design tools assist the conversion from paper-based agent mental-state designs to effective representations in abstract data structures that can be used by an agent management system to create intelligent software agents. This paper proposes the Goal Net Designer, an integrated development environment (IDE) for modeling agent behavior based on the goal net model, a goal-oriented methodology. It provides a way for users to simplify the various stages of the design process and automatically generates design data that can be used by the multi-agent development environment (MADE) to automatically create intelligent agents. The system reduces the level of skill required for developing agent-augmented applications to such an extent that users with little knowledge of intelligent software agent technology can easily add intelligent agents into their applications, saving time and cost in the development process.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3378 - Yu 2024
Enhancing Healthcare through Large Language Models: A Study on Medical Question Answering

Yu, Haoran; Yu, Chang; Wang, Zihan; Zou, Dongxian; Qin, Hao

arXiv 2024;():

2024

Ref ID: 8518

In recent years, the application of Large Language Models (LLMs) in healthcare has shown significant promise in improving the accessibility and dissemination of medical knowledge. This paper presents a detailed study of various LLMs trained on the MedQuAD medical question-answering dataset, with a focus on identifying the most effective model for providing accurate medical information. Among the models tested, the Sentence-t5 combined with Mistral 7B demonstrated superior performance, achieving a precision score of 0.762. This model's enhanced capabilities are attributed to its advanced pretraining techniques, robust architecture, and effective prompt construction methodologies. By leveraging these strengths, the Sentence-t5 + Mistral 7B model excels in understanding and generating precise medical answers. Our findings highlight the potential of integrating sophisticated LLMs in medical contexts to facilitate efficient and accurate medical knowledge retrieval, thus significantly enhancing patient education and support.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2212 - Yu 2010
Building a context world for dynamic service composition

Yu, L.; Glenstrup, A.; Su, S.; Zhang, Y.

5th International Conference on Pervasive Computing and Applications 2010;():336-341

2010

DOI: 10.1109/ICPCA.2010.5704123 · Ref ID: 6851

Dynamic service composition requires responding and adapting to changes in the computing environment when orchestrating existing services into one or more new services that better fit a composite application. This paper abstracts the changes of the environment as a context world that stores the physical contexts of the computing environment and user profiles, as well as the computed results of services. We use ontology techniques to model the domain concepts of application contexts. A Context Condition/Effect Description Language is designed to describe the dynamic semantics of the requirements and capabilities of goals and services in a concise and editable manner. Goal-driven and planning techniques are used to dynamically implement service composition according to the domain knowledge and facts in the context world.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#57 - Yu 2023
BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM

Yu, S.; Huang, T.; Liu, M. Y.; Wang, Z. J.

21st International Conference on Service-Oriented Computing (ICSOC) 2023;14419():339-346

Rome, ITALY Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-48421-6_23 · Ref ID: 3259

Knowledge graph (KG), as a novel knowledge storage approach, has been widely used in various domains. In the service computing community, researchers have tried to harness the enormous potential of KG to tackle domain-specific tasks. However, the lack of an openly available service domain KG limits the in-depth exploration of KGs in domain-specific applications. Building a service domain KG primarily faces two challenges: first, the diversity and complexity of service domain knowledge, and second, the dispersion of domain knowledge and the lack of annotated data. These challenges discouraged costly investment in large, high-quality domain-specific KGs by researchers. In this paper, we present the construction of a service domain KG called BEAR. We design a comprehensive service domain knowledge ontology to automatically generate the prompts for the Large Language Model (LLM) and employ the LLM to implement a zero-shot method to extract high-quality knowledge. A series of experiments are conducted to demonstrate the feasibility of the graph construction process and showcase the richness of content available from BEAR. Currently, BEAR includes 133,906 nodes, 169,159 relations, and about 424,000 factual knowledge items as attributes, which is available through github.com/HTXone/BEAR.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3111 - Yu 2023
BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM

Yu, Shuang; Huang, Tao; Liu, Mingyi; Wang, Zhongjie

Service-Oriented Computing: 21st International Conference, ICSOC 2023, Rome, Italy, November 28 – December 1, 2023, Proceedings, Part I 2023;():339–346

Rome, Italy Springer-Verlag 2023

DOI: 10.1007/978-3-031-48421-6_23 · Ref ID: 7110

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#743 - Yu 2023
The Second Workshop on Knowledge-Augmented Methods for Natural Language Processing

Yu, W. H.; Tong, L. B.; Shi, W. J.; Peng, N. Y.; Jiang, M.; Acm

29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2023;():5899-5900

Long Beach, CA Assoc Computing Machinery 2023

DOI: 10.1145/3580305.3599233 · Ref ID: 3238

Language models are being developed and deployed in many applications: "small"-scale and large-scale, generic and specialized, text-only and multimodal, etc. Meanwhile, the absence of important knowledge causes limitations and safety challenges. The knowledge includes commonsense, world facts, domain expertise, personalization, and especially the unique patterns that need to be discovered from big data applications. Training and inference processes of the language models can and should be augmented with this knowledge. The first KnowledgeNLP workshop at AAAI 2023 attracted scientists working on knowledge-augmentation methods toward higher language intelligence. This workshop offers a broad platform to share ideas and discuss various topics, such as (1) synergy between knowledge and language models, (2) scalable architectures that integrate NLP, knowledge graph, and graph learning technologies, (3) KnowledgeNLP for e-commerce, education, and healthcare, (4) human factors and social good in KnowledgeNLP.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3725 - Yu 2023
A Multimodal Ecological Civilization Pattern Recommendation Method Based on Large Language Models and Knowledge Graph

Yu, Zhihang; Wang, Shu; Zhu, Yunqiang; Zou, Zhiqiang

arXiv 2023;():

2023

Ref ID: 7912

The Ecological Civilization Pattern Recommendation System (ECPRS) aims to recommend suitable ecological civilization patterns for target regions, promoting sustainable development and reducing regional disparities. However, the current representative recommendation methods are not suitable for recommending ecological civilization patterns in a geographical context. There are two reasons for this. Firstly, regions have spatial heterogeneity, and the ECPRS needs to consider factors like climate, topography, vegetation, etc., to recommend civilization patterns adapted to specific ecological environments, ensuring the feasibility and practicality of the recommendations. Secondly, the abstract features of the ecological civilization patterns in the real world have not been fully utilized, resulting in poor richness in their embedding representations and, consequently, lower performance of the recommendation system. Considering these limitations, we propose the ECPR-MML method. Initially, based on the novel method UGPIG, we construct a knowledge graph to extract regional representations incorporating spatial heterogeneity features. Following that, inspired by the significant progress made by Large Language Models (LLMs) in the field of Natural Language Processing (NLP), we employ LLMs to generate multimodal features for ecological civilization patterns in the form of text and images. We extract and integrate these multimodal features to obtain semantically rich representations of ecological civilization. Through extensive experiments, we validate the performance of our ECPR-MML model. Our results show that F1@5 is 2.11% higher compared to state-of-the-art models, 2.02% higher than NGCF, and 1.16% higher than UGPIG. Furthermore, multimodal data can indeed enhance recommendation performance. However, the data generated by LLMs is not as effective as real data to a certain extent.

Mike voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3994 - Yuan 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models

Yuan, Hongbang; Cao, Pengfei; Jin, Zhuoran; Chen, Yubo; Zeng, Daojian; Liu, Kang; Zhao, Jun

arXiv 2024;():

2024

Ref ID: 8147

Large Language Models (LLMs) have shown impressive capabilities but still suffer from the issue of hallucinations. A significant type of this issue is the false premise hallucination, which we define as the phenomenon when LLMs generate hallucinated text when confronted with false premise questions. In this paper, we perform a comprehensive analysis of the false premise hallucination and elucidate its internal working mechanism: a small subset of attention heads (which we designate as false premise heads) disturb the knowledge extraction process, leading to the occurrence of false premise hallucination. Based on our analysis, we propose FAITH (False premise Attention head constraIning for miTigating Hallucinations), a novel and effective method to mitigate false premise hallucinations. It constrains the false premise attention heads during the model inference process. Impressively, extensive experiments demonstrate that constraining only approximately 1% of the attention heads in the model yields a notable increase of nearly 20% of model performance.

Xinchen voted
Davis voted
Final decision
What was the agreed final decision?

#3167 - Yuan 2024
VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph

Yuan, Jicheng; Le-Tuan, Anh; Nguyen-Duc, Manh; Tran, Trung-Kien; Hauswirth, Manfred; Le-Phuoc, Danh

The Semantic Web: 21st International Conference, ESWC 2024, Hersonissos, Crete, Greece, May 26–30, 2024, Proceedings, Part II 2024;():75–93

Hersonissos, Greece Springer-Verlag 2024

DOI: 10.1007/978-3-031-60635-9_5 · Ref ID: 7139

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#3635 - Yuan 2023
Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review

Yuan, Mingze; Bao, Peng; Yuan, Jiajia; Shen, Yunhao; Chen, Zifan; Xie, Yi; Zhao, Jie; Chen, Yang; Zhang, Li; Shen, Lin; Dong, Bin

arXiv 2023;():

2023

Ref ID: 7922

With the rapid development of artificial intelligence, large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support. However, medicine involves multifaceted data modalities and nuanced reasoning skills, presenting challenges for integrating LLMs. This paper provides a comprehensive review on the applications and implications of LLMs in medicine. It begins by examining the fundamental applications of general-purpose and specialized LLMs, demonstrating their utilities in knowledge retrieval, research support, clinical workflow automation, and diagnostic assistance. Recognizing the inherent multimodality of medicine, the review then focuses on multimodal LLMs, investigating their ability to process diverse data types like medical imaging and EHRs to augment diagnostic accuracy. To address LLMs' limitations regarding personalization and complex clinical reasoning, the paper explores the emerging development of LLM-powered autonomous agents for healthcare. Furthermore, it summarizes the evaluation methodologies for assessing LLMs' reliability and safety in medical contexts. Overall, this review offers an extensive analysis on the transformative potential of LLMs in modern medicine. It also highlights the pivotal need for continuous optimizations and ethical oversight before these models can be effectively integrated into clinical practice. Visit https://github.com/mingze-yuan/Awesome-LLM-Healthcare for an accompanying GitHub repository containing latest papers.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#985 - Yuan 2024
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base

Yuan, S.; Chen, J.; Sun, C.; Liang, J.; Xiao, Y.; Yang, D.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():1249-1265

Association for Computational Linguistics (ACL) 2024

Ref ID: 4385

Analogical reasoning is a fundamental cognitive ability of humans. However, current language models (LMs) still struggle to achieve human-like performance in analogical reasoning tasks due to a lack of resources for model training. In this work, we address this gap by proposing ANALOGYKB, a million-scale analogy knowledge base (KB) derived from existing knowledge graphs (KGs). ANALOGYKB identifies two types of analogies from the KGs: 1) analogies of the same relations, which can be directly extracted from the KGs, and 2) analogies of analogous relations, which are identified with a selection and filtering pipeline enabled by large language models (LLMs), followed by minor human efforts for data quality control. Evaluations on a series of datasets of two analogical reasoning tasks (analogy recognition and generation) demonstrate that ANALOGYKB successfully enables both smaller LMs and LLMs to gain better analogical reasoning capabilities. Resources of this paper can be found at https://github.com/siyuyuan/analogykb. © 2024 Association for Computational Linguistics.
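The first analogy type described above (analogies of the same relation, extracted directly from KG triples) can be sketched as follows. This is a toy illustration, not the ANALOGYKB pipeline; the `(head, relation, tail)` triple format and example entities are assumptions.

```python
from itertools import combinations

def same_relation_analogies(triples):
    """Pair up (head, tail) tuples that share a relation: (A:B) :: (C:D)."""
    by_rel = {}
    for head, rel, tail in triples:
        by_rel.setdefault(rel, []).append((head, tail))
    # Relations with at least two pairs yield candidate analogies.
    return {rel: list(combinations(pairs, 2))
            for rel, pairs in by_rel.items() if len(pairs) > 1}
```

For example, two `capital_of` triples yield the analogy (Paris : France) :: (Tokyo : Japan); the second analogy type (analogous relations) additionally requires LLM-based selection and filtering.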

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#91 - Yuan 2023
Causality-aware Concept Extraction based on Knowledge-guided Prompting

Yuan, S. Y.; Yang, D. Q.; Liu, J. X.; Tian, S. Y.; Liang, J. Q.; Xiao, Y. H.; Xie, R.

61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():9255-9272

Toronto, CANADA Assoc Computational Linguistics-Acl 2023

Ref ID: 3394

Concepts benefit natural language understanding but are far from complete in existing knowledge graphs (KGs). Recently, pre-trained language models (PLMs) have been widely used in text-based concept extraction (CE). However, PLMs tend to mine the co-occurrence associations from massive corpus as pre-trained knowledge rather than the real causal effect between tokens. As a result, the pre-trained knowledge confounds PLMs to extract biased concepts based on spurious co-occurrence correlations, inevitably resulting in low precision. In this paper, through the lens of a Structural Causal Model (SCM), we propose equipping the PLM-based extractor with a knowledge-guided prompt as an intervention to alleviate concept bias. The prompt adopts the topic of the given entity from the existing knowledge in KGs to mitigate the spurious co-occurrence correlations between entities and biased concepts. Our extensive experiments on representative multilingual KG datasets justify that our proposed prompt can effectively alleviate concept bias and improve the performance of PLM-based CE models. The code has been released on https://github.com/siyuyuan/KPCE.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3713 - Yun 2024
MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation

Yun, Sanggeon; Masukawa, Ryozo; Na, Minhyoung; Imani, Mohsen

arXiv 2024;():

2024

Ref ID: 8428

In the context of escalating safety concerns across various domains, the tasks of Video Anomaly Detection (VAD) and Video Anomaly Recognition (VAR) have emerged as critically important for applications in intelligent surveillance, evidence investigation, violence alerting, etc. These tasks, aimed at identifying and classifying deviations from normal behavior in video data, face significant challenges due to the rarity of anomalies which leads to extremely imbalanced data and the impracticality of extensive frame-level data annotation for supervised learning. This paper introduces a novel hierarchical graph neural network (GNN) based model MissionGNN that addresses these challenges by leveraging a state-of-the-art large language model and a comprehensive knowledge graph for efficient weakly supervised learning in VAR. Our approach circumvents the limitations of previous methods by avoiding heavy gradient computations on large multimodal models and enabling fully frame-level training without fixed video segmentation. Utilizing automated, mission-specific knowledge graph generation, our model provides a practical and efficient solution for real-time video analysis without the constraints of previous segmentation-based or multimodal approaches. Experimental validation on benchmark datasets demonstrates our model's performance in VAD and VAR, highlighting its potential to redefine the landscape of anomaly detection and recognition in video surveillance systems.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#1390 - Yunqiu 2022
Identifying Named Entities of Chinese Electronic Medical Records Based on RoBERTa-wwm Dynamic Fusion Model

Yunqiu, Z.; Yang, W.; Bocheng, L.

Data. Anal. Knowl. Discov. 2022;6(2-3):242-250

2022

DOI: 10.11925/infotech.2096-3467.2021.0951 · Ref ID: 5304

[Objective] This paper proposes an entity recognition model based on RoBERTa-wwm dynamic fusion, aiming to improve entity identification in Chinese electronic medical records. [Methods] First, we merged the semantic representations generated by each Transformer layer of the pre-trained language model RoBERTa-wwm. Then, we fed the merged representations into a bi-directional long short-term memory network and a conditional random field module to recognize the entities of the electronic medical records. [Results] We examined our new model with the dataset of the "2017 National Knowledge Graph and Semantic Computing Conference (CCKS 2017)" and self-annotated electronic medical records. Their F1 values reached 94.08% and 90.08%, which were 0.23% and 0.39% higher than the RoBERTa-wwm-BiLSTM-CRF model. [Limitations] The RoBERTa-wwm used in this paper completed its pre-training on a non-medical corpus. [Conclusions] The proposed method could improve the results of entity recognition tasks. © 2022, Chinese Academy of Sciences. All rights reserved.
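The dynamic-fusion step, merging the per-layer representations of the encoder, can be sketched as softmax-normalized scalar weights over the layer outputs. This is a generic sketch of layer fusion, not the paper's implementation; the function name, shapes, and use of NumPy are assumptions.

```python
import numpy as np

def dynamic_layer_fusion(layer_outputs, weights):
    """Fuse per-layer token representations with learned scalar
    weights, softmax-normalized so they sum to 1.

    layer_outputs: list of (seq_len, hidden) arrays, one per
    Transformer layer; weights: (n_layers,) raw scores."""
    w = np.exp(weights - np.max(weights))   # stable softmax
    w = w / w.sum()
    stacked = np.stack(layer_outputs)       # (n_layers, seq_len, hidden)
    return np.tensordot(w, stacked, axes=1) # (seq_len, hidden)

# Equal raw scores reduce to a plain average of the layers.
fused = dynamic_layer_fusion(
    [np.ones((2, 3)), 3 * np.ones((2, 3))], np.array([0.0, 0.0])
)
```

The fused sequence representation would then feed the BiLSTM-CRF tagger downstream.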

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#2354 - Yurin 2018
The domain-specific editor for rule-based knowledge bases

Yurin, A. Y.; Berman, A. F.; Nikolaychuk, O. A.; Dorodnykh, N. O.; Grishenko, M. A.

2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) 2018;():0961-0966

2018

DOI: 10.23919/MIPRO.2018.8400176 · Ref ID: 6039

The aim of the paper is to describe a domain-specific editor for the design of rule-based knowledge bases in the field of the prognosis of technical conditions and remaining operation time of petrochemical equipment. The architecture, main functions, and the structure of the editor's configuration files are presented. The distinguishing feature of the editor is a semantic layer in the form of a platform-independent model. This layer makes it possible to configure the editor to account for the features of a subject domain. The semantic layer is implemented as a set of domain-specific templates describing facts and rules (cause-and-effect relationships). These templates help to abstract from the syntax of particular knowledge representation languages (programming languages for knowledge bases, in particular CLIPS, the C Language Integrated Production System) and to generate the graphical user interface elements.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#414 - Zafar 2024
KIMedQA: towards building knowledge-enhanced medical QA models

Zafar, A.; Sahoo, S. K.; Varshney, D.; Das, A.; Ekbal, A.

J. Intell. Inf. Syst. 2024;62(3):833-858

2024

DOI: 10.1007/s10844-024-00844-1 · Ref ID: 3230

Medical question-answering systems require the ability to extract accurate, concise, and comprehensive answers. They will better comprehend complex text and produce helpful answers if they can reason on the explicit constraints described in the question's textual context and the implicit, pertinent knowledge of the medical world. Integrating Knowledge Graphs (KG) with Language Models (LMs) is a common approach to incorporating structured information sources. However, effectively combining and reasoning over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system (KIMedQA), which employs two techniques, viz. relevant knowledge graph selection and pruning of the large-scale graph, to handle Vector Space Inconsistency (VSI) and Excessive Knowledge Information (EKI). The representations of the query and context are then combined with the pruned knowledge network using a pre-trained language model to generate an informed answer. Finally, we demonstrate through an in-depth empirical evaluation that our suggested strategy provides cutting-edge outcomes on two benchmark datasets, namely MASH-QA and COVID-QA. We also compared our results to ChatGPT, a robust and very powerful generative model, and found that our model outperforms ChatGPT according to the F1 score and human evaluation metrics such as adequacy.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#594 - Zahera 2022
MULTPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs

Zahera, H. M.; Vollmers, D.; Sherif, M. A.; Ngomo, A. C. N.

21st International Semantic Web Conference (ISWC) 2022;13489():303-318

Electr Network Springer International Publishing Ag 2022

DOI: 10.1007/978-3-031-19433-7_18 · Ref ID: 2965

Keyphrase extraction aims to identify a small set of phrases that best describe the content of a text. The automatic generation of keyphrases has become essential for many natural language applications such as text categorization, indexing, and summarization. In this paper, we propose MULTPAX, a multitask framework for extracting present and absent keyphrases using pre-trained language models and knowledge graphs. In particular, our framework contains three components: first, MULTPAX identifies present keyphrases from an input document. Then, MULTPAX links with external knowledge graphs to retrieve more relevant phrases. Finally, MULTPAX ranks the extracted phrases based on their semantic relatedness to the input document and returns the top-k phrases as the final output. We conducted several experiments on four benchmark datasets to evaluate the performance of MULTPAX against different state-of-the-art baselines. The evaluation results demonstrate that our approach significantly outperforms the state-of-the-art baselines, with a significance t-test p < 0.041. Our source code and datasets are publicly available at https://github.com/dice-group/MultPAX.
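The final ranking step, ordering candidate phrases by semantic relatedness to the document, is commonly realized as cosine similarity between embeddings. A minimal sketch under that assumption (the embeddings would come from a language model; here they are plain vectors, and all names are illustrative):

```python
import numpy as np

def rank_phrases(doc_vec, phrase_vecs, phrases, k=3):
    """Rank candidate phrases by cosine similarity to the document
    embedding and return the top-k, mirroring a relatedness-based
    ranking stage."""
    doc = doc_vec / np.linalg.norm(doc_vec)
    scored = [
        (p, float(doc @ (v / np.linalg.norm(v))))
        for p, v in zip(phrases, phrase_vecs)
    ]
    scored.sort(key=lambda pair: -pair[1])
    return [p for p, _ in scored[:k]]

top = rank_phrases(
    np.array([1.0, 0.2]),
    [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
    ["knowledge graph", "summarization"],
    k=1,
)  # picks the phrase most aligned with the document vector
```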

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3425 - Zavarella 2024
A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models

Zavarella, Vanni; Gamero-Salinas, Juan Carlos; Consoli, Sergio

arXiv 2024;():

2024

Ref ID: 8507

Knowledge graphs (KGs) have been successfully applied to the analysis of complex scientific and technological domains, with automatic KG generation methods typically building upon relation extraction models capturing fine-grained relations between domain entities in text. While these relations are fully applicable across scientific areas, existing models are trained on few domain-specific datasets such as SciERC and do not perform well on new target domains. In this paper, we experiment with leveraging in-context learning capabilities of Large Language Models to perform schema-constrained data annotation, collecting in-domain training instances for a Transformer-based relation extraction model deployed on titles and abstracts of research papers in the Architecture, Construction, Engineering and Operations (AECO) domain. By assessing the performance gain with respect to a baseline Deep Learning architecture trained on off-domain data, we show that by using a few-shot learning strategy with structured prompts and only minimal expert annotation the presented approach can potentially support domain adaptation of a science KG generation model.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#2042 - Zeng 2024
XLORE 3: A Large-Scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources

Zeng, K.; Jin, H.; Lv, X.; Zhu, F.; Hou, L.; Zhang, Y.; Pang, F.; Qi, Y.; Liu, D.; Li, J.; Feng, L.

ACM Trans. Inf. Syst. 2024;42(6):

2024

DOI: 10.1145/3660521 · Ref ID: 3928

In recent years, knowledge graph (KG) has attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs built from Baidu Baike and Wikipedia via a series of knowledge modeling and acquisition methods. In this article, we utilize systematic methods to improve XLORE's data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: (1) We design a comprehensive and reasonable schema, namely XLORE ontology, which can effectively organize and manage entities from various resources. (2) We merge equivalent entities in different languages to facilitate knowledge sharing. We provide a large-scale entity linking system to establish the associations between unstructured text and structured KG. (3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available and downloadable online at https://www.xlore.cn/, providing a valuable resource for researchers and practitioners in various fields. © 2024 Copyright held by the owner/author(s).

Mike voted
Ishan voted
Final decision
What was the agreed final decision?

#3348 - Zeng 2023
Domain Knowledge Graph Construction Via A Simple Checker

Zeng, Yueling; Wang, Li- C.

arXiv 2023;():

2023

Ref ID: 7875

With the availability of large language models, there is growing interest among semiconductor chip design companies in leveraging these technologies. For those companies, deployment of a new methodology must include two important considerations: confidentiality and scalability. In this context, this work tackles the problem of knowledge graph construction from hardware-design domain texts. We propose an oracle-checker scheme to leverage the power of GPT-3.5 and demonstrate that the essence of the problem lies in the distillation of a domain expert's background knowledge. Using the RISC-V unprivileged ISA specification as an example, we explain the key ideas and discuss the practicality of our proposed oracle-checker approach.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3693 - Zha 2023
M²ConceptBase: A Fine-Grained Aligned Concept-Centric Multimodal Knowledge Base

Zha, Zhiwei; Wang, Jiaan; Li, Zhixu; Zhu, Xiangru; Song, Wei; Xiao, Yanghua

arXiv 2023;():

2023

Ref ID: 7990

Multimodal knowledge bases (MMKBs) provide cross-modal aligned knowledge crucial for multimodal tasks. However, the images in existing MMKBs are generally collected for entities in encyclopedia knowledge graphs. Therefore, detailed groundings of visual semantics with linguistic concepts are lacking, which are essential for the visual concept cognition ability of multimodal models. Addressing this gap, we introduce M²ConceptBase, the first concept-centric MMKB. M²ConceptBase models concepts as nodes with associated images and detailed textual descriptions. We propose a context-aware multimodal symbol grounding approach to align concept-image and concept-description pairs using context information from image-text datasets. Comprising 951K images and 152K concepts, M²ConceptBase links each concept to an average of 6.27 images and a single description, ensuring comprehensive visual and textual semantics. Human studies confirm more than 95% alignment accuracy, underscoring its quality. Additionally, our experiments demonstrate that M²ConceptBase significantly enhances VQA model performance on the OK-VQA task. M²ConceptBase also substantially improves the fine-grained concept understanding capabilities of multimodal large language models through retrieval augmentation in two concept-related tasks, highlighting its value.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3926 - Zhai 2023
Towards Faithful Knowledge Graph Explanation Through Deep Alignment in Commonsense Question Answering

Zhai, Weihe; Zubiaga, Arkaitz

arXiv 2023;():

2023

Ref ID: 7874

The fusion of language models (LMs) and knowledge graphs (KGs) is widely used in commonsense question answering, but generating faithful explanations remains challenging. Current methods often overlook path decoding faithfulness, leading to divergence between graph encoder outputs and model predictions. We identify confounding effects and LM-KG misalignment as key factors causing spurious explanations. To address this, we introduce the LM-KG Fidelity metric to assess KG representation reliability and propose the LM-KG Distribution-aware Alignment (LKDA) algorithm to improve explanation faithfulness. Without ground truth, we evaluate KG explanations using the proposed Fidelity-Sparsity Trade-off Curve. Experiments on CommonsenseQA and OpenBookQA show that LKDA significantly enhances explanation fidelity and model performance, highlighting the need to address distributional misalignment for reliable commonsense reasoning.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3416 - Zhang 2024
Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction

Zhang, Bowen; Soh, Harold

arXiv 2024;():

2024

Ref ID: 8219

In this work, we are interested in automated methods for knowledge graph creation (KGC) from input text. Progress on large language models (LLMs) has prompted a series of recent works applying them to KGC, e.g., via zero/few-shot prompting. Despite successes on small domain-specific datasets, these models face difficulties scaling up to text common in many real-world applications. A principal issue is that, in prior methods, the KG schema has to be included in the LLM prompt to generate valid triplets; larger and more complex schemas easily exceed the LLMs' context window length. Furthermore, there are scenarios where a fixed pre-defined schema is not available and we would like the method to construct a high-quality KG with a succinct self-generated schema. To address these problems, we propose a three-phase framework named Extract-Define-Canonicalize (EDC): open information extraction followed by schema definition and post-hoc canonicalization. EDC is flexible in that it can be applied to settings where a pre-defined target schema is available and when it is not; in the latter case, it constructs a schema automatically and applies self-canonicalization. To further improve performance, we introduce a trained component that retrieves schema elements relevant to the input text; this improves the LLMs' extraction performance in a retrieval-augmented generation-like manner. We demonstrate on three KGC benchmarks that EDC is able to extract high-quality triplets without any parameter tuning and with significantly larger schemas compared to prior works. Code for EDC is available at https://github.com/clear-nus/edc.

Srividya voted
Xinchen voted
Final decision
What was the agreed final decision?

#582 - Zhang 2023
Multi-Faceted Knowledge-Driven Pre-Training for Product Representation Learning

Zhang, D. H.; Liu, Y. C.; Yuan, Z. X.; Fu, Y. J.; Chen, H. F.; Xiong, H.

IEEE Trans. Knowl. Data Eng. 2023;35(7):7239-7250

2023

DOI: 10.1109/tkde.2022.3200921 · Ref ID: 3752

As a key component of e-commerce computing, product representation learning (PRL) provides benefits for a variety of applications, including product matching, search, and categorization. The existing PRL approaches have poor language understanding ability due to their inability to capture contextualized semantics. In addition, the learned representations by existing methods are not easily transferable to new products. Inspired by the recent advance of pre-trained language models (PLMs), we make the attempt to adapt PLMs for PRL to mitigate the above issues. In this article, we develop KINDLE, a Knowledge-drIven pre-trainiNg framework for proDuct representation LEarning, which can preserve the contextual semantics and multi-faceted product knowledge robustly and flexibly. Specifically, we first extend traditional one-stage pre-training to a two-stage pre-training framework, and exploit a deliberate knowledge encoder to ensure a smooth knowledge fusion into PLM. In addition, we propose a multi-objective heterogeneous embedding method to represent thousands of knowledge elements. This helps KINDLE calibrate knowledge noise and sparsity automatically by replacing isolated classes as training targets in knowledge acquisition tasks. Furthermore, an input-aware gating network is proposed to select the most relevant knowledge for different downstream tasks. Finally, extensive experiments have demonstrated the advantages of KINDLE over the state-of-the-art baselines across three downstream tasks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#774 - Zhang 2024
SimRE: Simple contrastive learning with soft logical rule for knowledge graph embedding

Zhang, D.; Rong, Z.; Xue, C. Y.; Li, G. Y.

Inf. Sci. 2024;661():14

2024

DOI: 10.1016/j.ins.2023.120069 · Ref ID: 2990

Knowledge graphs serve as a pivotal framework for the structured representation of information regarding entities and relations. However, in the real world, these knowledge graphs are often incomplete, harboring missing facts. Knowledge graph completion (KGC) has emerged as a central research focus, entailing the automated prediction of these missing facts, and has garnered substantial scholarly attention in recent years. Text-based knowledge graph embedding methods have demonstrated considerable potential for tackling the challenges associated with KGC by employing pre-trained language models. However, their limitation lies in the lack of logical features, which constrains their efficacy in capturing intricate patterns within knowledge graphs. This paper proposes SimRE, a straightforward contrastive learning framework augmented with soft logic rules. SimRE introduces a self-supervised framework that leverages the input rule bodies to predict the corresponding rule heads through a contrastive objective. We introduce two rule sampling techniques to enhance the efficiency and accuracy of the model: in-batch rule negatives and pre-batch rule negatives. SimRE employs a simple method for integrating logical features with the text-based model. The experimental results on benchmark datasets demonstrate that the proposed approach outperforms state-of-the-art methods.
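The in-batch-negative idea behind such contrastive objectives can be sketched with a generic InfoNCE-style loss: each anchor's matching row is its positive, and every other row in the batch serves as a negative. This is a standard formulation, not SimRE's actual objective; the function name and temperature value are assumptions.

```python
import numpy as np

def info_nce(anchors, positives, tau=0.05):
    """InfoNCE with in-batch negatives: row i of `positives` is the
    positive for row i of `anchors`; all other rows are negatives.
    Returns the mean cross-entropy over the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / tau                       # (B, B) cosine / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Aligned pairs give a near-zero loss; mismatched pairs a large one.
batch = np.eye(2)
loss_aligned = info_nce(batch, batch)
loss_shuffled = info_nce(batch, batch[::-1].copy())
```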

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1552 - Zhang 2000
Language model for multilingual natural language generation

Zhang, Dongmo; Ge, Yong; Yao, Tianfang

Shanghai Jiaotong Daxue Xuebao 2000;34(7):944-947

2000

Ref ID: 5828

This paper introduces a knowledge representation model for a multilingual natural language text generation system. The model is divided into two levels: semantic resources and syntax resources. The former is used to describe text content through schemas and optimized rules; the latter is used to construct sentence patterns, map language resources, and determine the specific form of the text according to the sentence structure class, syntactic rules, and lexical information. The model is based on a complex feature set and can be used to extend the abstract initial semantic data into all kinds of language resources for multilingual text generation.

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#3520 - Zhang 2024
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models

Zhang, Fuxiang; Li, Junyou; Li, Yi-Chen; Zhang, Zongzhang; Yu, Yang; Ye, Deheng

arXiv 2024;():

2024

Ref ID: 8446

Low sample efficiency is an enduring challenge of reinforcement learning (RL). With the advent of versatile large language models (LLMs), recent works impart common-sense knowledge to accelerate policy learning for RL processes. However, we note that such guidance is often tailored to one specific task and loses generalizability. In this paper, we introduce a framework that harnesses LLMs to extract background knowledge of an environment, which contains general understandings of the entire environment, allowing various downstream RL tasks to benefit from a one-time knowledge representation. We ground LLMs by feeding them a few pre-collected experiences and requesting that they delineate the background knowledge of the environment. Afterward, we represent the output knowledge as potential functions for potential-based reward shaping, which has the desirable property of preserving policy optimality under the task rewards. We instantiate three variants to prompt LLMs for background knowledge, including writing code, annotating preferences, and assigning goals. Our experiments show that these methods achieve significant sample efficiency improvements in a spectrum of downstream tasks from the Minigrid and Crafter domains.
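The optimality-preserving property referenced above is the standard result for potential-based shaping (Ng et al., 1999): adding F(s, s') = γ·Φ(s') − Φ(s) to the task reward leaves the optimal policy unchanged. A minimal sketch, with illustrative names (Φ would be the LLM-derived potential function):

```python
def shaped_reward(r, phi_s, phi_s_next, gamma=0.99):
    """Potential-based reward shaping: add F(s, s') = gamma * phi(s')
    - phi(s) to the task reward r. Because F telescopes along any
    trajectory, the optimal policy under the shaped reward matches
    the optimal policy under the original reward."""
    return r + gamma * phi_s_next - phi_s

# Task reward 1.0, potentials phi(s)=2.0 and phi(s')=3.0, gamma=0.5:
# shaped value is 1.0 + 0.5*3.0 - 2.0 = 0.5
r_shaped = shaped_reward(1.0, 2.0, 3.0, gamma=0.5)
```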

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#861 - Zhang 2023
User-Centric Conversational Recommendation: Adapting the Need of User with Large Language Models

Zhang, G. Y.; Acm

17th ACM Conference on Recommender Systems (RecSys) 2023;():1349-1354

Singapore, SINGAPORE Assoc Computing Machinery 2023

DOI: 10.1145/3604915.3608885 · Ref ID: 3327

Conversational recommender systems (CRS) promise to provide a more natural user experience for exploring and discovering items of interest through ongoing conversation. However, effectively modeling and adapting to users' complex and changing preferences remains challenging. This research develops user-centric methods that focus on understanding and adapting to users throughout conversations to provide the most helpful recommendations. First, a graph-based Conversational Path Reasoning (CPR) framework is proposed that represents dialogs as interactive reasoning over a knowledge graph to capture nuanced user interests and explain recommendations. To further enhance relationship modeling, graph neural networks are incorporated for improved representation learning. Next, to address uncertainty in user needs, the Vague Preference Multi-round Conversational Recommendation (VPMCR) scenario and matching Adaptive Vague Preference Policy Learning (AVPPL) solution are presented using reinforcement learning to tailor recommendations to evolving preferences. Finally, opportunities to leverage large language models are discussed to further advance user experiences via advanced user modeling, policy learning, and response generation. Overall, this research focuses on designing conversational recommender systems that continuously understand and adapt to users' ambiguous, complex and changing needs during natural conversations.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#1421 - Zhang 2023
Integrating Automated Knowledge Extraction with Large Language Models for Explainable Medical Decision-Making

Zhang, H.; Li, J.; Wang, Y.; Songi, Y.

Proceedings - 2023 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():1710-1717

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BIBM58861.2023.10385557 · Ref ID: 4985

Large language models (LLMs) have demonstrated strong reasoning ability and inspired many previously unimaginable applications. In this paper, we aim to harness the strong reasoning capability of LLMs for explainable medical diagnosis. As we know, deep learning has been widely adopted and has shown improvement in medical diagnostics. However, it is often criticized for its lack of interpretability. To address this drawback, we propose the first method that innovatively combines Markov logic networks (MLNs) with external knowledge extracted using LLMs, aiming to improve both interpretability and accuracy. Specifically, our approach involves a new process, powered by LLMs and a search engine, to automatically collect and organize external medical knowledge. The outcome is a set of first-order logic (FOL) rules, which then become a key component of the following MLN-based diagnostic algorithm. The resulting MLN-based model can maintain the accuracy of deep networks while providing understandable reasoning for its decisions. By blending specific knowledge from the medical domain with LLM techniques, our work contributes to the development of improved automatic diagnosis systems, with the potential to enhance transparency and trust in medical diagnostics. © 2023 IEEE.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#3241 - Zhang 2023
CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation

Zhang, Hongbo; Tang, Chen; Loakman, Tyler; Yang, Bohao; Goetze, Stefan; Lin, Chenghua

arXiv 2023;():

2023

Ref ID: 7700

Commonsense knowledge is crucial to many natural language processing tasks. Existing works usually incorporate graph knowledge with conventional graph neural networks (GNNs), resulting in a sequential pipeline that compartmentalizes the encoding processes for textual and graph-based knowledge. This compartmentalization, however, does not fully exploit the contextual interplay between these two types of input knowledge. In this paper, a novel context-aware graph-attention model (Context-aware GAT) is proposed, designed to effectively assimilate global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism. Specifically, the proposed framework employs an innovative approach to representation learning that harmonizes heterogeneous features by amalgamating flattened graph knowledge with text data. The hierarchical application of graph knowledge aggregation within connected subgraphs, complemented by contextual information to bolster the generation of commonsense-driven dialogues, is analyzed. Empirical results demonstrate that our framework outperforms conventional GNN-based language models in terms of performance. Both automated and human evaluations affirm the significant performance enhancements achieved by our proposed model over the concept flow baseline.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1547 - Zhang 2024
LA-UCL: LLM-Augmented Unsupervised Contrastive Learning Framework for Few-Shot Text Classification

Zhang, J.; Gao, H.; Zhang, P.; Feng, B.; Deng, W.; Hou, Y.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():10198-10207

European Language Resources Association (ELRA) 2024

Ref ID: 4597

Few-shot tasks require the model to generalize from a few samples. However, due to the lack of cognitive ability, current works cannot fully utilize limited samples to expand the sample space and still suffer from overfitting issues. To address these problems, we propose an LLM-Augmented Unsupervised Contrastive Learning Framework (LA-UCL), which introduces a cognition-enabled Large Language Model (LLM) for efficient data augmentation and presents corresponding contrastive learning strategies. Specifically, in the self-augmented contrastive learning module, we construct a retrieval-based in-context prompt scheme by retrieving similar but differently categorized data from the original samples, guiding the LLM to generate more discriminative augmented data. We then design a group-level contrastive loss to enhance the model's discriminative ability. In the external-augmented contrastive learning module, we utilize web knowledge retrieval to expand the sample space, leverage the LLM to generate more diverse data, and introduce a sample-level contrastive loss for unlabeled data to improve the model's generalization. Experimental results on six datasets show that our model exceeds the baseline models. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1452 - Zhang 2024
A Joint Method for Combat Intent Recognition and Key Information Extraction

Zhang, J.; Lu, L.; Jiang, G.; Yuan, C.; Zhang, H.; Zheng, S.

Communications in Computer and Information Science 2024;2018 CCIS():115-125

Springer Science and Business Media Deutschland GmbH 2024

DOI: 10.1007/978-981-97-0844-4_9 · Ref ID: 4728

To alleviate the problems of poor quality and low efficiency in traditional combat plan making, we propose an intelligent combat plan generation method based on the BERT pre-trained language model. First, we studied practical combat scenarios and military-related websites, and constructed a military-domain combat intent dataset that includes structured information such as combat categories, objects, and scenarios. Second, we utilize the BERT pre-trained language model for semantic analysis of requirements, TextCNN (Convolutional Neural Network for Text) for combat intent recognition, and BiLSTM (Bidirectional Long Short-Term Memory) for key information extraction and entity normalization. Thus, based on the intent and key information, candidate schemes can be retrieved from the knowledge graph in the field of military operations in the future. Compared with traditional methods, the scheme quality and generation efficiency are significantly improved. This study provides an effective approach for intelligent decision support in the military field, and also offers references for intelligent scheme generation in other domains. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

brandon voted
Kwesi voted
Final decision
What was the agreed final decision?

#422 - Zhang 2022
A knowledge extraction framework for domain-specific application with simplified pre-trained language model and attention-based feature extractor

Zhang, J.; Qin, B.; Zhang, Y. F.; Zhou, J. H.; Wang, H. W.

Serv. Oriented Comput. Appl. 2022;16(2):121-131

2022

DOI: 10.1007/s11761-022-00337-5 · Ref ID: 3618

With the advancement of industrial informatics, intelligent algorithms are increasingly applied in various industrial products and applications. In this paper, we propose a knowledge extraction framework for domain-specific text. This framework can extract entities from text for subsequent tasks such as knowledge graph construction. The proposed framework contains three modules, namely a domain-feature pre-trained model, LSTM-based named entity recognition, and attention-based nested named entity recognition. The domain-feature pre-trained model can effectively learn the features of a domain corpus, such as professional terms that are not included in the general-domain corpus. Flat named entity recognition can use the vectors from the pre-trained model to obtain the entities from domain-specific text. The nested named entity recognition, based on the attention mechanism and the weight sliding balance strategy, can effectively identify entity types with higher nesting rates. The framework achieves good results in the field of nuclear power plant maintenance reports, and the methods for the domain pre-trained model and LSTM-based flat named entity recognition have been successfully applied to practical tasks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#286 - Zhang 2021
A Framework for Effective Knowledge Extraction from A Data Space Formed by Unstructured Technical Reports using Pre-trained Models

Zhang, J.; Qin, B.; Zhang, Y. F.; Zhou, J. H.; Wang, H. W.; Ieee

17th IEEE International Conference on E-Business Engineering (ICEBE) 2021;():120-125

S China Univ Technol, Guangzhou, PEOPLES R CHINA Ieee 2021

DOI: 10.1109/icebe52470.2021.00028 · Ref ID: 3217

The transformation of unstructured data into triples is a key task in knowledge graph construction. It remains a great challenge to complete this task on technical reports. In this work, we propose a framework for effective data structuring in knowledge graph construction from a data space formed by technical reports. This framework consists of two pre-trained language models that provide the embeddings and a sequence labeling model that tags the entity labels. The pre-trained models, i.e. the Flair embedding and the BERT model, are employed to combine the output vectors for downstream tasks. To evaluate the proposed method, we conduct named entity recognition experiments using the status reports of complex equipment in nuclear power plants. The evaluation shows the framework achieves a remarkable improvement in F1 score. This paper details the framework, the experiments, and the evaluation of the proposed method.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1837 - Zhang 2021
RoKGDS: A Robust Knowledge Grounded Dialog System

Zhang, J.; Sun, Y.; Zhang, Y.; Xu, W.; Ying, J.; Yang, Y.; Lan, M.; Ma, M.; Yuan, H.; Zhu, J.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2021;13029 LNAI():377-387

Springer Science and Business Media Deutschland GmbH 2021

DOI: 10.1007/978-3-030-88483-3_30 · Ref ID: 5632

In this paper, we propose a pre-training based Robust Knowledge Grounded Dialog System (RoKGDS) to enhance the performance of the model in unknown scenarios; it is easily generalized to various knowledge grounded dialog tasks, such as persona dialog, knowledge dialog, and recommendation dialog. We use a bucket encoder to efficiently extract all kinds of knowledge information (e.g. profile, knowledge graph, and dialog goal). To improve the robustness of the model, we develop a hybrid decoder with a hybrid attention and a copy mechanism. The hybrid attention is an adaptation scheme to apply the pre-trained language model to our model, and the copy mechanism is a gate mechanism that controls whether a word is generated from the generic vocabulary or copied from the input knowledge. Experiments show that our model is more robust than the other baseline models. Furthermore, we use visualization to explain the effectiveness of the hybrid attention compared to the other two adaptation schemes. In the 2021 Language and Intelligence Challenge: Multi-Skill Dialog task, our best model ranked 3rd in the automatic evaluation stage and 5th in the human evaluation stage. © 2021, Springer Nature Switzerland AG.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1066 - Zhang 2024
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles

Zhang, J.; Xu, C.; Li, B.

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2024;():15459-15469

IEEE Computer Society 2024

DOI: 10.1109/CVPR52733.2024.01464 · Ref ID: 4108

We present ChatScene, a Large Language Model (LLM)-based agent that leverages the capabilities of LLMs to generate safety-critical scenarios for autonomous vehicles. Given unstructured language instructions, the agent first generates textually described traffic scenarios using LLMs. These scenario descriptions are subsequently broken down into several sub-descriptions for specified details such as behaviors and locations of vehicles. The agent then distinctively transforms the textually described sub-scenarios into domain-specific languages, which then generate actual code for prediction and control in simulators, facilitating the creation of diverse and complex scenarios within the CARLA simulation environment. A key part of our agent is a comprehensive knowledge retrieval component, which efficiently translates specific textual descriptions into corresponding domain-specific code snippets by training a knowledge database containing the scenario description and code pairs. Extensive experimental results underscore the efficacy of ChatScene in improving the safety of autonomous vehicles. For instance, the scenarios generated by ChatScene show a 15% increase in collision rates compared to state-of-the-art baselines when tested against different reinforcement learning-based ego vehicles. Furthermore, we show that by using our generated safety-critical scenarios to fine-tune different RL-based autonomous driving models, they can achieve a 9% reduction in collision rates, surpassing current SOTA methods. ChatScene effectively bridges the gap between textual descriptions of traffic scenarios and practical CARLA simulations, providing a unified way to conveniently generate safety-critical scenarios for safety testing and improvement for AVs. The code is available at https://github.com/javyduck/ChatScene. © 2024 IEEE.
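The abstract describes a retrieval component that maps textual sub-descriptions to stored code snippets from a database of description–code pairs. As a rough illustration only (the paper trains a learned retriever; the snippet database, scenario API strings, and bag-of-words similarity below are all hypothetical, not ChatScene's implementation), nearest-neighbor lookup over such pairs could look like:

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words vector; the paper uses a trained knowledge database instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical description -> code-snippet pairs (invented for illustration).
KNOWLEDGE_DB = {
    "vehicle suddenly brakes ahead of ego": "scenario.add_actor('lead_car', behavior='hard_brake')",
    "pedestrian crosses from the right": "scenario.add_actor('pedestrian', behavior='cross_right')",
}

def retrieve_snippet(sub_description):
    """Return the stored snippet whose description is most similar to the query."""
    q = embed(sub_description)
    best = max(KNOWLEDGE_DB, key=lambda d: cosine(q, embed(d)))
    return KNOWLEDGE_DB[best]

print(retrieve_snippet("a pedestrian crosses the road from the right side"))
```

The real system would embed descriptions with a learned encoder and emit full domain-specific-language programs rather than single calls.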

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#1611 - Zhang 2023
A LLM-Based Simulation Scenario Aided Generation Method

Zhang, J.; Zhang, Y.; Chu, M.; Yang, S.; Zu, T.

ITOEC 2023 - IEEE 7th Information Technology and Mechatronics Engineering Conference 2023;():1350-1354

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ITOEC57671.2023.10291525 · Ref ID: 5020

In the simulation training system, the generation of simulation scenarios is a basic problem that needs to be studied. First, the paper expounds on the technical characteristics of LLMs and knowledge graphs; it then structurally describes the content related to simulation scenarios and builds a scenario knowledge graph. According to the characteristics of scenario-aided generation, a simulation scenario generation method based on an LLM is proposed, which uses prompts to fuse the knowledge graph with the LLM. Next, the implementation steps of this method are elaborated. Finally, a specific application shows that the method proposed in this paper is a good reference for the generation of simulation scenarios. © 2023 IEEE.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3968 - Zhang 2022
Utilizing Background Knowledge for Robust Reasoning over Traffic Situations

Zhang, Jiarui; Ilievski, Filip; Kollaa, Aravinda; Francis, Jonathan; Ma, Kaixin; Oltramari, Alessandro

arXiv 2022;():

2022

Ref ID: 7627

Understanding novel situations in the traffic domain requires an intricate combination of domain-specific and causal commonsense knowledge. While prior work has provided sufficient perception-based modalities for traffic monitoring, in this paper we focus on a complementary research aspect of Intelligent Transportation: traffic understanding. We scope our study to text-based methods and datasets, given the abundant commonsense knowledge that can be extracted using language models from large corpora and knowledge graphs. We adopt three knowledge-driven approaches for zero-shot QA over traffic situations, based on prior natural language inference methods, commonsense models with knowledge graph self-supervision, and dense-retriever-based models. We construct two text-based multiple-choice question answering sets: BDD-QA for evaluating causal reasoning in the traffic domain and HDT-QA for measuring the possession of domain knowledge akin to human driving license tests. Among the methods, Unified-QA reaches the best performance on the BDD-QA dataset with the adaptation of multiple formats of question answers. Language models trained with inference information and commonsense knowledge are also good at predicting cause and effect in the traffic domain but perform badly at answering human-driving QA sets. For such sets, DPR+Unified-QA performs best due to its efficient knowledge extraction.

Davis voted
Srividya voted
Final decision
What was the agreed final decision?

#3580 - Zhang 2024
KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

Zhang, Jiawei; Xu, Chejian; Gai, Yu; Lecue, Freddy; Song, Dawn; Li, Bo

arXiv 2024;():

2024

Ref ID: 8215

This paper introduces KnowHalu, a novel approach for detecting hallucinations in text generated by large language models (LLMs), utilizing step-wise reasoning, multi-formulation queries, multi-form knowledge for factual checking, and a fusion-based detection mechanism. As LLMs are increasingly applied across various domains, ensuring that their outputs are not hallucinated is critical. Recognizing the limitations of existing approaches that either rely on the self-consistency check of LLMs or perform post-hoc fact-checking without considering the complexity of queries or the form of knowledge, KnowHalu proposes a two-phase process for hallucination detection. In the first phase, it identifies non-fabrication hallucinations: responses that, while factually correct, are irrelevant or non-specific to the query. The second phase, multi-form based factual checking, contains five key steps: reasoning and query decomposition, knowledge retrieval, knowledge optimization, judgment generation, and judgment aggregation. Our extensive evaluations demonstrate that KnowHalu significantly outperforms SOTA baselines in detecting hallucinations across diverse tasks, e.g., improving by 15.65% in QA tasks and 5.50% in summarization tasks, highlighting its effectiveness and versatility in detecting hallucinations in LLM-generated content.

yuexi voted
Mike voted
Final decision
What was the agreed final decision?

#4002 - Zhang 2024
Zero-Shot Learning Over Large Output Spaces: Utilizing Indirect Knowledge Extraction from Large Language Models

Zhang, Jinbin; Ullah, Nasib; Babbar, Rohit

arXiv 2024;():

2024

Ref ID: 8381

Extreme Multi-label Learning (XMC) is a task that allocates the most relevant labels for an instance from a predefined label set. Extreme Zero-shot XMC (EZ-XMC) is a special setting of XMC wherein no supervision is provided; only the instances (raw text of the document) and the predetermined label set are given. The scenario is designed to address cold-start problems in categorization and recommendation. Traditional state-of-the-art methods extract pseudo labels from the document title or segments. These labels from the document are used to train a zero-shot bi-encoder model. The main issue with these generated labels is their misalignment with the tagging task. In this work, we propose a framework to train a small bi-encoder model via feedback from a large language model (LLM); the bi-encoder encodes the document and labels into embeddings for retrieval. Our approach leverages the zero-shot ability of the LLM to assess the correlation between labels and the document instead of using the low-quality labels extracted from the document itself. Our method also guarantees fast inference without the involvement of the LLM. Our approach outperforms the SOTA methods on various datasets while retaining a similar training time for large datasets.

Ishan voted
brandon voted
Final decision
What was the agreed final decision?

#2128 - Zhang 2024
The Application of Fine-Tuning on Pretrained Language Model in Information Extraction for Fault Knowledge Graphs

Zhang, K.; Su, F.; Huang, Y.; Li, Y.; Wu, F.; Mao, Y.

2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP) 2024;():469-473

2024

DOI: 10.1109/ICSP62122.2024.10743881 · Ref ID: 6909

Constructing fault knowledge graphs holds significant importance for achieving intelligent maintenance and diagnosis in high-end equipment manufacturing. Effective information extraction and knowledge graph construction have proven challenging due to the lack of standardized representation of semantically complex unstructured text in the industrial domain. Therefore, in this study, we performed fine-tuning on the pre-trained language model (ChatGLM2-6B) with specific prompts to achieve information extraction from fault-related texts, ultimately leading to the construction of a fault knowledge graph. Experimental results demonstrate that the proposed method not only supports fine-tuning with limited data but also exhibits enhanced capability in understanding complex semantics related to fault symptoms and causes.

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#9 - Zhang 2024
Advancing building energy modeling with large language models: Exploration and case studies

Zhang, L.; Chen, Z. L.; Ford, V.

Energy Build. 2024;323():19

2024

DOI: 10.1016/j.enbuild.2024.114788 · Ref ID: 3474

The rapid progression in artificial intelligence has facilitated the emergence of large language models like ChatGPT, offering potential applications extending into specialized engineering modeling, especially physics-based building energy modeling. This paper investigates the innovative integration of large language models with building energy modeling software, focusing specifically on the fusion of ChatGPT with EnergyPlus. A literature review is first conducted to reveal a growing trend of incorporating large language models in engineering modeling, albeit limited research on their application in building energy modeling. We underscore the potential of large language models in addressing building energy modeling challenges and outline potential applications including simulation input generation, simulation output analysis and visualization, conducting error analysis, co-simulation, simulation knowledge extraction and training, and simulation optimization. Three case studies reveal the transformative potential of large language models in automating and optimizing building energy modeling tasks, underscoring the pivotal role of artificial intelligence in advancing sustainable building practices and energy efficiency. The case studies demonstrate that selecting the right large language model techniques is essential to enhance performance and reduce engineering efforts. The findings advocate a multidisciplinary approach in future artificial intelligence research, with implications extending beyond building energy modeling to other specialized engineering modeling.

yuexi voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#2127 - Zhang 2009
Application of Decision-making Ontology for automotive body assembly design

Zhang, L.; Gao, L.; Shao, X. Y.

2009 IEEE International Conference on Industrial Engineering and Engineering Management 2009;():764-768

2009

DOI: 10.1109/IEEM.2009.5372924 · Ref ID: 6468

Automotive body assembly design in the conceptual design stage is a complex group decision-making process which involves knowledge communication and project selection. Ontology is a useful tool to abstract vague and linguistic knowledge. In this paper, a project-handling module based decision-making ontology model (PMDOM) is proposed to accelerate the communication of decision-making and help with the selection of projects. Based on PMDOM, an automotive body assembly domain ontology model is set up, and the decision-making support system (DSS) has been implemented using JAVA. Results show that the DSS with PMDOM performs better than systems without PMDOM.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3632 - Zhang 2024
Large Language Models as Event Forecasters

Zhang, Libo; Ning, Yue

arXiv 2024;():

2024

Ref ID: 8386

Key elements of human events are extracted as quadruples that consist of subject, relation, object, and timestamp. This representation can be extended to a quintuple by adding a fifth element: a textual summary that briefly describes the event. These quadruples or quintuples, when organized within a specific domain, form a temporal knowledge graph (TKG). Current learning frameworks focus on a few TKG-related tasks, such as predicting an object given a subject and a relation or forecasting the occurrences of multiple types of events (i.e., relation) in the next time window. They typically rely on complex structural and sequential models like graph neural networks (GNNs) and recurrent neural networks (RNNs) to update intermediate embeddings. However, these methods often neglect the contextual information inherent in each quintuple, which can be effectively captured through concise textual descriptions. In this paper, we investigate how large language models (LLMs) can streamline the design of TKG learning frameworks while maintaining competitive accuracy in prediction and forecasting tasks. We develop multiple prompt templates to frame the object prediction (OP) task as a standard question-answering (QA) task, suitable for instruction fine-tuning with an encoder-decoder generative LLM. For multi-event forecasting (MEF), we design simple yet effective prompt templates for each TKG quintuple. This novel approach removes the need for GNNs and RNNs, instead utilizing an encoder-only LLM to generate fixed intermediate embeddings, which are subsequently processed by a prediction head with a self-attention mechanism to forecast potential future relations. Extensive experiments on multiple real-world datasets using various evaluation metrics validate the effectiveness and robustness of our approach.
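The abstract's core representation (quadruples extended to quintuples) and its QA-style prompt framing for object prediction can be sketched concretely. The template, field names, and example facts below are invented for illustration and are not the paper's actual prompt design:

```python
from dataclasses import dataclass

@dataclass
class Quintuple:
    """One TKG event: quadruple (subject, relation, object, timestamp) plus a text summary."""
    subject: str
    relation: str
    obj: str
    timestamp: str
    summary: str

def op_prompt(history, query):
    """Frame object prediction (OP) as a QA prompt: history as context, missing object as the question."""
    context = "\n".join(
        f"[{q.timestamp}] {q.subject} {q.relation} {q.obj}. {q.summary}"
        for q in history
    )
    subj, rel, ts = query  # the object is what the model must predict
    return f"{context}\nQuestion: at {ts}, {subj} {rel} whom?\nAnswer:"

history = [Quintuple("Country_A", "negotiated_with", "Country_B",
                     "2024-01", "Trade talks resumed.")]
print(op_prompt(history, ("Country_A", "negotiated_with", "2024-02")))
```

A prompt like this would then be fed to an instruction-tuned generative LLM, replacing the GNN/RNN embedding pipeline the abstract contrasts against.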

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#1214 - Zhang 2022
DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering

Zhang, M.; Dai, R.; Dong, M.; He, T.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5123-5133

Association for Computational Linguistics (ACL) 2022

Ref ID: 5411

In recent years, Graph Neural Network (GNN) approaches with enhanced knowledge graphs (KG) perform well in question answering (QA) tasks. One critical challenge is how to effectively utilize interactions between the QA context and KG. However, existing work only adopts the identical QA context representation to interact with multiple layers of KG, which results in a restricted interaction. In this paper, we propose DRLK (Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs), a novel model that utilizes dynamic hierarchical interactions between the QA context and KG for reasoning. DRLK extracts dynamic hierarchical features in the QA context, and performs inter-layer and intra-layer interactions on each iteration, allowing the KG representation to be grounded with the hierarchical features of the QA context. We conduct extensive experiments on four benchmark datasets in medical QA and commonsense reasoning. The experimental results demonstrate that DRLK achieves state-of-the-art performances on two benchmark datasets and performs competitively on the others. © 2022 Association for Computational Linguistics.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#1380 - Zhang 2024
How Language Model Hallucinations Can Snowball

Zhang, M.; Press, O.; Merrill, W.; Liu, A.; Smith, N. A.

Proceedings of Machine Learning Research 2024;235():59670-59684

ML Research Press 2024

Ref ID: 4400

A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. Hallucinations are often attributed to knowledge gaps in LMs, but we show that LMs sometimes produce hallucinations that they can separately recognize as incorrect. To do this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. Crucially, we find that GPT-3.5, GPT-4, and LLaMA2-70B-chat can identify 67%, 87%, and 94% of these incorrect claims, respectively. We show that this phenomenon doesn't disappear under higher-temperature sampling, beam search, and zero-shot chain-of-thought prompting. These findings reveal that LM hallucinations can snowball: early mistakes by an LM can lead to more mistakes that otherwise would not be made. Copyright 2024 by the author(s)

Xinchen voted
Srividya voted
Final decision
What was the agreed final decision?

#424 - Zhang 2024
Knowledge graph accuracy evaluation: an LLM-enhanced embedding approach

Zhang, M. T.; Yang, G. L.; Liu, Y.; Shi, J.; Bai, X. Y.

Int. J. Data Sci. Anal. 2024;():15

2024

DOI: 10.1007/s41060-024-00661-3 · Ref ID: 2934

As an effective means of knowledge representation and storage, knowledge graphs have been widely used in various fields. However, with the rapid increase in the scale and volume of knowledge graphs, knowledge quality issues inevitably arise. To evaluate the accuracy of a knowledge graph effectively and efficiently, a common paradigm is to match the facts in the knowledge graph against specific external knowledge. In this study, an LLM-enhanced (large language model enhanced) embedding framework is designed, integrating the verification ability of large language models to further evaluate the embedding results. First, an optimized embedding model is proposed that makes use of the knowledge graph's internal structural information to measure whether the relation of a given triplet is probably well-founded. Then, the triplets that have fewer paths to support themselves are selected as questionable ones, as their correctness cannot be determined confidently. Finally, the questionable triplets are filtered, and LLMs are adopted as external knowledge for further fact verification. The three parts are aggregated to achieve automated, accurate, and efficient evaluation of knowledge graphs.
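The path-support idea in the abstract (triplets with few supporting paths are flagged as questionable and sent for LLM verification) can be sketched in miniature. This toy version counts only two-hop alternative paths and ignores relation types, so it is a simplification of the idea, not the paper's embedding model:

```python
from collections import defaultdict

def alt_paths(edges, head, tail):
    """Count two-hop paths head -> mid -> tail, ignoring the direct edge and relation labels."""
    adj = defaultdict(set)
    for h, _, t in edges:
        adj[h].add(t)
    return sum(1 for mid in adj[head] if mid != tail and tail in adj[mid])

def questionable(edges, threshold=1):
    """Triplets with fewer than `threshold` alternative support paths would go to LLM fact verification."""
    return [e for e in edges if alt_paths(edges, e[0], e[2]) < threshold]

edges = [
    ("a", "r1", "b"), ("a", "r2", "c"), ("c", "r3", "b"),  # a->b also supported via c
    ("x", "r4", "y"),                                      # no alternative support
]
print(questionable(edges))
```

In the paper's framework, the support score comes from a learned embedding model rather than raw path counts, and only the filtered questionable triplets incur LLM calls.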

Kwesi voted
Davis voted
Final decision
What was the agreed final decision?

#1494 - Zhang 2022
Knowledge Collaborative Fine-tuning for Low-resource Knowledge Graph Completion

Zhang, N. Y.; Xie, X.; Chen, X.; Deng, S. M.; Ye, H. B.; Chen, H. J.

Ruan Jian Xue Bao 2022;33(10):3531-3545

2022

DOI: 10.13328/j.cnki.jos.006628 · Ref ID: 5402

Knowledge graph completion can make the knowledge graph more complete. Unfortunately, most existing methods for knowledge graph completion assume that the entities or relations in the knowledge graph have sufficient triple instances. Nevertheless, there are a great deal of long-tail triples in general domains, and it is challenging to obtain a large amount of high-quality annotation data in vertical domains. To address these issues, a knowledge collaborative fine-tuning approach is proposed for low-resource knowledge graph completion. The structured knowledge is leveraged to construct the initial prompt template, and the optimal templates, labels, and model parameters are learnt through a collaborative fine-tuning algorithm. The proposed method leverages the explicit structured knowledge in the knowledge graph and the implicit triple knowledge from the language model, and it can be applied to the tasks of link prediction and relation extraction. Experimental results show that the proposed approach can obtain state-of-the-art performance on three knowledge graph reasoning datasets and five relation extraction datasets. © 2022 Chinese Academy of Sciences. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#2607 - Zhang 2019
Knowledge Adaptive Neural Network for Natural Language Inference

Zhang, Q.; Yang, Y.; Chen, C.; He, L.; Yu, Z.

2019 International Joint Conference on Neural Networks (IJCNN) 2019;():1-8

2019

DOI: 10.1109/IJCNN.2019.8851884 · Ref ID: 6246

Natural language inference (NLI) has received widespread attention in recent years due to its contribution to various natural language processing tasks, such as question answering, abstractive text summarization, and video captioning. Most existing works focus on modeling the sentence interaction information, while the use of commonsense knowledge is not well studied for NLI. In this paper, we propose a knowledge adaptive neural network (KANN) that adaptively incorporates commonsense knowledge at the sentence encoding and inference stages. We first perform knowledge collection and representation to identify the relevant knowledge. Then we use a knowledge absorption gate to embed knowledge into neural network models. Experiments on two benchmark datasets for natural language inference, namely SNLI and MultiNLI, show the advantages of our proposed model. Furthermore, our model is comparable to, if not better than, the recent neural network based approaches on NLI.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#285 - Zhang 2023
ForensiQ: A Knowledge Graph Question Answering System for IoT Forensics

Zhang, R. P.; Xie, M. J.

14th EAI International Conference on Digital Forensics and Cyber Crime (ICDF2C) 2023;571():300-314

New York, NY Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-56583-0_20 · Ref ID: 3046

The increasing number of attacks against the Internet of Things (IoT) has made IoT forensics critically important for reporting and mitigating cyber incidents and crimes. However, the heterogeneity of IoT environments and the complexity and volume of IoT data present significant challenges to forensic practitioners. The advent of question answering (QA) systems and large language models (LLM) offers a potential solution to accessing sophisticated IoT forensic knowledge and data. In light of this, we propose ForensiQ, a framework based on knowledge graph question answering (KGQA), to help investigators navigate complex IoT forensic artifacts and cybersecurity knowledge. Our framework integrates knowledge graphs (KG) into the IoT forensic workflow to better organize and analyze forensic artifacts. We also have developed a novel KGQA model that serves as a natural-language user interface to the IoT forensic KG. Our evaluation results show that, compared to existing KGQA models, ForensiQ demonstrates higher accuracy in answering natural language questions when applied to our experimental IoT forensic KG.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1729 - Zhang 2024
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs

Zhang, R.; Shen, J.; Liu, T.; Wang, H.; Qin, Z.; Han, F.; Liu, J.; Baumgartner, S.; Bendersky, M.; Zhang, C.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():15623-15636

Association for Computational Linguistics (ACL) 2024

Ref ID: 4298

Large Language Models (LLMs) have exhibited impressive capabilities in various tasks, yet their vast parameter sizes restrict their applicability in resource-constrained settings. Knowledge distillation (KD) offers a viable solution by transferring expertise from large teacher models to compact student models. However, traditional KD techniques face specific challenges when applied to LLMs, including restricted access to LLM outputs, significant teacher-student capacity gaps, and the inherited mis-calibration issue. In this work, we present PLaD, a novel preference-based LLM distillation framework. PLaD exploits the teacher-student capacity discrepancy to generate pseudo-preference pairs where teacher outputs are preferred over student outputs. Then, PLaD leverages a ranking loss to re-calibrate student's estimation of sequence likelihood, which steers the student's focus towards understanding the relative quality of outputs instead of simply imitating the teacher. PLaD bypasses the need for access to teacher LLM's internal states, tackles the student's expressivity limitations, and mitigates the student mis-calibration issue. Through extensive experiments on two sequence generation tasks and with various LLMs, we demonstrate the effectiveness of our PLaD framework. © 2024 Association for Computational Linguistics.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#43 - Zhang 2024
AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment Enabled by Large Language Models

Zhang, R.; Su, Y. X.; Trisedya, B. D.; Zhao, X. Y.; Yang, M.; Cheng, H.; Qi, J. Z.

IEEE Trans. Knowl. Data Eng. 2024;36(6):2357-2371

2024

DOI: 10.1109/tkde.2023.3325484 · Ref ID: 3164

The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to the best of our knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automatic alignment method named AutoAlign, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity-graph with the help of large language models to automatically capture the similarity between predicates across two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic, but also highly effective. Experiments using real-world KGs show that AutoAlign improves the performance of entity alignment significantly compared to state-of-the-art methods.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#1524 - Zhang 2023
Knowledge-Augmented Frame Semantic Parsing with Hybrid Prompt-Tuning

Zhang, R.; Sun, Y.; Yang, J.; Peng, W.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 2023;2023-June():

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/ICASSP49357.2023.10095476 · Ref ID: 5160

Frame semantics-based approaches have been widely used in semantic parsing tasks and have become mainstream. It remains challenging to disambiguate frame representations evoked by target lexical units under different contexts. Pre-trained Language Models (PLMs) have been used in semantic parsing and significantly improve the accuracy of neural parsers. However, the PLMs-based approaches tend to favor collocated patterns presented in the training data, leading to inaccurate outcomes. The intuition here is to design a mechanism to optimally use knowledge captured in semantic frames in conjunction with PLMs to disambiguate frames. We propose a novel Knowledge-Augmented Frame Semantic Parsing Architecture (KAF-SPA) to enhance semantic representation by incorporating accurate frame knowledge into PLMs during frame semantic parsing. Specifically, a Memory-based Knowledge Extraction Module (MKEM) is devised to select accurate frame knowledge and construct the continuous templates in the high dimensional vector space. Moreover, we design a Task-oriented Knowledge Probing Module (TKPM) using hybrid prompts (in terms of continuous and discrete prompts) to incorporate the selected knowledge into the PLMs and adapt PLMs to the tasks of frame and argument identification. Experimental results on two public FrameNet datasets demonstrate that our method significantly outperforms strong baselines (by more than +3% in F1), achieving state-of-the-art results on the current benchmark. Ablation studies verify the effectiveness of KAF-SPA. © 2023 IEEE.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3821 - Zhang 2022
REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction

Zhang, Sheng; Ng, Patrick; Wang, Zhiguo; Xiang, Bing

arXiv 2022;():

2022

Ref ID: 7553

Relation extraction is an important but challenging task that aims to extract all hidden relational facts from the text. With the development of deep language models, relation extraction methods have achieved good performance on various benchmarks. However, we observe two shortcomings of previous methods: first, there is no unified framework that works well under various relation extraction settings; second, effectively utilizing external knowledge as background information is absent. In this work, we propose a knowledge-enhanced generative model to mitigate these two issues. Our generative model is a unified framework to sequentially generate relational triplets under various relation extraction settings and explicitly utilizes relevant knowledge from Knowledge Graph (KG) to resolve ambiguities. Our model achieves superior performance on multiple benchmarks and settings, including WebNLG, NYT10, and TACRED.

Mike voted
Xinchen voted
Final decision
What was the agreed final decision?

#336 - Zhang 2021
HORNET: Enriching Pre-trained Language Representations with Heterogeneous Knowledge Sources

Zhang, T. L.; Cai, Z. R.; Wang, C. Y.; Li, P.; Li, Y.; Qiu, M. H.; Tang, C. G.; He, X. F.; Huang, J.; Acm

30th ACM International Conference on Information and Knowledge Management (CIKM) 2021;():2608-2617

Univ Queensland, ELECTR NETWORK Assoc Computing Machinery 2021

DOI: 10.1145/3459637.3482436 · Ref ID: 3070

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the language understanding abilities of deep language models by leveraging the rich semantic knowledge from knowledge graphs, other than plain pre-training texts. However, previous efforts mostly use homogeneous knowledge (especially structured relation triples in knowledge graphs) to enhance the context-aware representations of entity mentions, whose performance may be limited by the coverage of knowledge graphs. Also, it is unclear whether these KEPLMs truly understand the injected semantic knowledge due to the "blackbox" training mechanism. In this paper, we propose a novel KEPLM named HORNET, which integrates Heterogeneous knOwledge from various structured and unstructured sources into the Roberta NETwork and hence takes full advantage of both linguistic and factual knowledge simultaneously. Specifically, we design a hybrid attention heterogeneous graph convolution network (HaHGCN) to learn heterogeneous knowledge representations based on the structured relation triplets from knowledge graphs and the unstructured entity description texts. Meanwhile, we propose the explicit dual knowledge understanding tasks to help induce a more effective infusion of the heterogeneous knowledge, promoting our model for learning the complicated mappings from the knowledge graph embedding space to the deep context-aware embedding space and vice versa. Experiments show that our HORNET model outperforms various KEPLM baselines on knowledge-aware tasks including knowledge probing, entity typing and relation extraction. Our model also achieves substantial improvement over several GLUE benchmark datasets, compared to other KEPLMs.

Srividya voted
Ishan voted

#1578 - Zhang 2023
Learning Knowledge-Enhanced Contextual Language Representations for Domain Natural Language Understanding

Zhang, T.; Xu, R.; Wang, C.; Duan, Z.; Chen, C.; Qiu, M.; Cheng, D.; He, X.; Qian, W.

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():15663-15676

Association for Computational Linguistics (ACL) 2023

Ref ID: 4988

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the performance of various downstream NLP tasks by injecting knowledge facts from large-scale Knowledge Graphs (KGs). However, existing methods for pre-training KEPLMs with relational triples are difficult to adapt to closed domains due to the lack of sufficient domain graph semantics. In this paper, we propose a Knowledge-enhanced lANGuAge Representation learning framework for various clOsed dOmains (KANGAROO) via capturing the implicit graph structure among the entities. Specifically, since the entity coverage rates of closed-domain KGs can be relatively low and may exhibit the global sparsity phenomenon for knowledge injection, we consider not only the shallow relational representations of triples but also the hyperbolic embeddings of deep hierarchical entity-class structures for effective knowledge fusion. Moreover, as two closed-domain entities under the same entity-class often have locally dense neighbor subgraphs counted by max point biconnected component, we further propose a data augmentation strategy based on contrastive learning over subgraphs to construct hard negative samples of higher quality. It makes the underlying KEPLMs better distinguish the semantics of these neighboring entities to further complement the global semantic sparsity. In the experiments, we evaluate KANGAROO over various knowledge-aware and general NLP tasks in both full and few-shot learning settings, significantly outperforming various KEPLM training paradigms in closed domains. © 2023 Association for Computational Linguistics.

Srividya voted
Xinchen voted

#723 - Zhang 2022
Research on the Chinese Named-Entity-Relation-Extraction Method for Crop Diseases Based on BERT

Zhang, W. H.; Wang, C. S.; Wu, H. R.; Zhao, C. J.; Teng, G. F.; Huang, S. F.; Liu, Z.

Agronomy-Basel 2022;12(9):14

2022

DOI: 10.3390/agronomy12092130 · Ref ID: 3446

In order to integrate fragmented text data of crop disease knowledge to solve the current problems of disordered knowledge management, weak correlation and difficulty in knowledge sharing, a Chinese named-entity-relation-extraction model for crop diseases (BBCPF) was proposed in this paper by utilizing the advantage of knowledge graph in describing complex relations between disease entities in a structured form. This model was composed of two parts, i.e., named-entity recognition and relation extraction, in the form of an assembly line. To deal with the different meanings of Chinese crop disease terms in different contexts and to better obtain the contextual information, the BERT model was introduced for dynamic vector representations. Then, the BiLSTM layer was used to learn long-distance text information, and the CRF was applied to obtain the globally optimal labeling sequence, so as to output the crop disease entities. According to the entity category, the entities were divided into subjects and objects, which were then input into the disordered language model PERT to extract the contextual features of the relation data. At last, the fully connected layer was used to decode the information and output the crop disease entity-relation triples. The experiment results show that, on the self-built disease corpus dataset, the Precision, Recall, and F1-Score values of the established model reached 85.63%, 79.46% and 82.43%, respectively, for entity extraction, and reached 97.96%, 98.43% and 98.16%, respectively, for relation extraction. This paper provides an effective method for information extraction in the construction of Chinese crop disease domain knowledge graphs.

Mike voted
Davis voted

#279 - Zhang 2024
Fine-tuning large language models for chemical text mining

Zhang, W.; Wang, Q. G.; Kong, X. T.; Xiong, J. C.; Ni, S. K.; Cao, D. H.; Niu, B. Y.; Chen, M. G.; Li, Y. M.; Zhang, R. Z.; Wang, Y. T.; Zhang, L. H.; Li, X. T.; Xiong, Z. P.; Shi, Q.; Huang, Z. M.; Fu, Z. Y.; Zheng, M. Y.

Chem. Sci. 2024;15(27):10600-10611

2024

DOI: 10.1039/d4sc00924j · Ref ID: 3598

Extracting knowledge from complex and diverse chemical texts is a pivotal task for both experimental and computational chemists. The task is still considered to be extremely challenging due to the complexity of the chemical language and scientific literature. This study explored the power of fine-tuned large language models (LLMs) on five intricate chemical text mining tasks: compound entity recognition, reaction role labelling, metal-organic framework (MOF) synthesis information extraction, nuclear magnetic resonance spectroscopy (NMR) data extraction, and the conversion of reaction paragraphs to action sequences. The fine-tuned LLMs demonstrated impressive performance, significantly reducing the need for repetitive and extensive prompt engineering experiments. For comparison, we guided ChatGPT (GPT-3.5-turbo) and GPT-4 with prompt engineering and fine-tuned GPT-3.5-turbo as well as other open-source LLMs such as Mistral, Llama3, Llama2, T5, and BART. The results showed that the fine-tuned ChatGPT models excelled in all tasks. They achieved exact accuracy levels ranging from 69% to 95% on these tasks with minimal annotated data. They even outperformed those task-adaptive pre-training and fine-tuning models that were based on a significantly larger amount of in-domain data. Notably, fine-tuned Mistral and Llama3 show competitive abilities. Given their versatility, robustness, and low-code capability, leveraging fine-tuned LLMs as flexible and effective toolkits for automated data acquisition could revolutionize chemical knowledge extraction. Extracting knowledge from complex chemical texts is essential for both experimental and computational chemists. Fine-tuned large language models (LLMs) can serve as flexible and effective extractors for automated data acquisition.

Xinchen voted
mohammed afaan voted

#3946 - Zhang 2024
TrustUQA: A Trustful Framework for Unified Structured Data Question Answering

Zhang, Wen; Jin, Long; Zhu, Yushan; Chen, Jiaoyan; Huang, Zhiwei; Wang, Junjie; Hua, Yin; Liang, Lei; Chen, Huajun

arXiv 2024;():

2024

Ref ID: 8429

Natural language question answering (QA) over structured data sources such as tables and knowledge graphs (KGs) has been widely investigated, for example with Large Language Models (LLMs). The main solutions include question-to-formal-query parsing and retrieval-based answer generation. However, current methods of the former often suffer from weak generalization, failing to deal with multiple sources simultaneously, while the latter is limited in trustfulness. In this paper, we propose UnifiedTQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way. To this end, it adopts an LLM-friendly and unified knowledge representation method called Condition Graph (CG), and uses an LLM and demonstration-based two-level method for CG querying. For enhancement, it is also equipped with dynamic demonstration retrieval. We have evaluated UnifiedTQA with 5 benchmarks covering 3 types of structured data. It outperforms 2 existing unified structured data QA methods, and in comparison with baselines that are specific to a data type, it achieves state-of-the-art on 2 of them. Furthermore, we demonstrate the potential of our method for more general QA tasks: QA over mixed structured data and QA across structured data.

Ishan voted
brandon voted

#1857 - Zhang 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

Zhang, X.; Peng, B.; Tian, Y.; Zhou, J.; Jin, L.; Song, L.; Mi, H.; Meng, H.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():1946-1965

Association for Computational Linguistics (ACL) 2024

Ref ID: 4394

Despite showing impressive abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., “hallucinations”, even when they hold relevant knowledge. To mitigate these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality. Specifically, we incorporate SELF-EVAL, a self-evaluation component, to prompt an LLM to validate the factuality of its own generated responses solely based on its internal knowledge. Additionally, we design Self-Knowledge Tuning (SK-TUNING) to augment the LLM's self-evaluation ability by improving the model's confidence estimation and calibration. We then utilize these self-annotated responses to fine-tune the model via the Direct Preference Optimization algorithm. We show that the proposed self-alignment approach substantially enhances factual accuracy over LLAMA family models across three key knowledge-intensive tasks on TruthfulQA and BioGEN. © 2024 Association for Computational Linguistics.

Kwesi voted
Xinchen voted

#841 - Zhang 2024
Traditional Chinese Medicine Knowledge Graph Construction Based on Large Language Models

Zhang, Y. C.; Hao, Y. T.

Electronics 2024;13(7):21

2024

DOI: 10.3390/electronics13071395 · Ref ID: 2914

This study explores the use of large language models in constructing a knowledge graph for Traditional Chinese Medicine (TCM) to improve the representation, storage, and application of TCM knowledge. The knowledge graph, based on a graph structure, effectively organizes entities, attributes, and relationships within the TCM domain. By leveraging large language models, we collected and embedded substantial TCM-related data, generating precise representations transformed into a knowledge graph format. Experimental evaluations confirmed the accuracy and effectiveness of the constructed graph, extracting various entities and their relationships, providing a solid foundation for TCM learning, research, and application. The knowledge graph has significant potential in TCM, aiding in teaching, disease diagnosis, treatment decisions, and contributing to TCM modernization. In conclusion, this paper utilizes large language models to construct a knowledge graph for TCM, offering a vital foundation for knowledge representation and application in the field, with potential for future expansion and refinement.

Srividya voted
Ishan voted

#1544 - Zhang 2024
KnowVrDU: A Unified Knowledge-aware Prompt-Tuning Framework for Visually-rich Document Understanding

Zhang, Y.; Chen, Y.; Zhu, J.; Xu, J.; Yang, S.; Wu, Z.; Huang, L.; Huang, Y.; Chen, S.

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9878-9889

European Language Resources Association (ELRA) 2024

Ref ID: 4601

In Visually-rich Document Understanding (VrDU), recent advances in incorporating layout and image features into pre-trained language models have achieved significant progress. Existing methods usually developed complicated dedicated architectures based on pre-trained models and fine-tuned them with costly high-quality data to eliminate the inconsistency of knowledge distribution between the pre-training task and specialized downstream tasks. However, due to their huge data demands, these methods are not suitable for few-shot settings, which are essential for quick applications with limited resources, yet few previous works address them. To solve these problems, we propose a unified Knowledge-aware prompt-tuning framework for Visual-rich Document Understanding (KnowVrDU) to enable broad utilization for diverse concrete applications and reduce data requirements. To model heterogeneous VrDU structures without designing task-specific architectures, we propose to reformulate various VrDU tasks into a single question-answering format with task-specific prompts and train the pre-trained model with the parameter-efficient prompt tuning method. To bridge the knowledge gap between the pre-training task and specialized VrDU tasks without additional annotations, we propose a prompt knowledge integration mechanism to leverage external open-source knowledge bases. We conduct experiments on several benchmark datasets in few-shot settings and the results validate the effectiveness of our method. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

mohammed afaan voted
yuexi voted

#1602 - Zhang 2024
Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models

Zhang, Y.; He, Q.; Wang, X.; Yuan, S.; Liang, J.; Xiao, Y.

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():13379-13389

Association for Computational Linguistics (ACL) 2024

Ref ID: 4291

Multi-Modal Knowledge Graphs (MMKGs) have proven valuable for various downstream tasks. However, scaling them up is challenging because building large-scale MMKGs often introduces mismatched images (i.e., noise). Most entities in KGs belong to the long tail, meaning there are few images of them available online. This scarcity makes it difficult to determine whether a found image matches the entity. To address this, we draw on the Triangle of Reference Theory and suggest enhancing vision-language models with concept guidance. Specifically, we introduce COG, a two-stage framework with COncept-Guided vision-language models. The framework comprises a CONCEPT INTEGRATION module, which effectively identifies image-text pairs of long-tailed entities, and an EVIDENCE FUSION module, which offers explainability and enables human verification. To demonstrate the effectiveness of COG, we create a dataset of 25k image-text pairs of long-tailed entities. Our comprehensive experiments show that COG not only improves the accuracy of recognizing long-tailed image-text pairs compared to baselines but also offers flexibility and explainability. © 2024 Association for Computational Linguistics.

mohammed afaan voted
Ishan voted

#3147 - Zhang 2024
Making Large Language Models Perform Better in Knowledge Graph Completion

Zhang, Yichi; Chen, Zhuo; Guo, Lingbing; Xu, Yajing; Zhang, Wen; Chen, Huajun

Proceedings of the 32nd ACM International Conference on Multimedia 2024;():233–242

Melbourne VIC, Australia Association for Computing Machinery 2024

DOI: 10.1145/3664647.3681327 · Ref ID: 7121

Ishan voted
Srividya voted

#3255 - Zhang 2024
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs

Zhang, Yifei; Wang, Xintao; Liang, Jiaqing; Xia, Sirui; Chen, Lida; Xiao, Yanghua

arXiv 2024;():

2024

Ref ID: 8436

Large Language Models (LLMs) have exhibited impressive proficiency in various natural language processing (NLP) tasks, which involve increasingly complex reasoning. Knowledge reasoning, a primary type of reasoning, aims at deriving new knowledge from existing knowledge. While it has been widely studied in the context of knowledge graphs (KGs), knowledge reasoning in LLMs remains underexplored. In this paper, we introduce Chain-of-Knowledge (CoK), a comprehensive framework for knowledge reasoning, including methodologies for both dataset construction and model learning. For dataset construction, we create KnowReason via rule mining on KGs. For model learning, we observe rule overfitting induced by naive training. Hence, we enhance CoK with a trial-and-error mechanism that simulates the human process of internal knowledge exploration. We conduct extensive experiments with KnowReason. Our results show the effectiveness of CoK in refining LLMs in not only knowledge reasoning but also general reasoning benchmarks.

Srividya voted
Ishan voted

#3197 - Zhang 2024
AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models

Zhang, Yongheng; Du, Tingwen; Ma, Yunshan; Wang, Xiang; Xie, Yi; Yang, Guozheng; Lu, Yuliang; Chang, Ee-Chien

arXiv 2024;():

2024

Ref ID: 8282

Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, they generally suffer from limited generalization capability to diverse knowledge types as well as requirement of expertise in model design and tuning. Addressing these limitations, we seek to utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks given exceptional capabilities in both language understanding and zero-shot task fulfillment. Thus, we propose a fully automatic LLM-based framework to construct attack knowledge graphs named AttacKG+. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation, including behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.

Davis voted
mohammed afaan voted

#2071 - Zhang 2023
Large-Scale Biomedical Relation Extraction Across Diverse Relation Types: Model Development and Usability Study on COVID-19

Zhang, Z.; Fang, M.; Wu, R.; Zong, H.; Huang, H.; Tong, Y.; Xie, Y.; Cheng, S.; Wei, Z.; Crabbe, M. J. C.; Zhang, X.; Wang, Y.

J Med Internet Res 2023;25():e48115

2023

DOI: 10.2196/48115 · Ref ID: 5893

BACKGROUND: Biomedical relation extraction (RE) is of great importance for researchers to conduct systematic biomedical studies. It not only helps knowledge mining, such as knowledge graphs and novel knowledge discovery, but also promotes translational applications, such as clinical diagnosis, decision-making, and precision medicine. However, the relations between biomedical entities are complex and diverse, and comprehensive biomedical RE is not yet well established. OBJECTIVE: We aimed to investigate and improve large-scale RE with diverse relation types and conduct usability studies with application scenarios to optimize biomedical text mining. METHODS: Data sets containing 125 relation types with different entity semantic levels were constructed to evaluate the impact of entity semantic information on RE, and performance analysis was conducted on different model architectures and domain models. This study also proposed a continued pretraining strategy and integrated models with scripts into a tool. Furthermore, this study applied RE to the COVID-19 corpus with article topics and application scenarios of clinical interest to assess and demonstrate its biological interpretability and usability. RESULTS: The performance analysis revealed that RE achieves the best performance when the detailed semantic type is provided. For a single model, PubMedBERT with continued pretraining performed the best, with an F1-score of 0.8998. Usability studies on COVID-19 demonstrated the interpretability and usability of RE, and a relation graph database was constructed, which was used to reveal existing and novel drug paths with edge explanations. The models (including pretrained and fine-tuned models), integrated tool (Docker), and generated data (including the COVID-19 relation graph database and drug paths) have been made publicly available to the biomedical text mining community and clinical researchers. 
CONCLUSIONS: This study provided a comprehensive analysis of RE with diverse relation types. Optimized RE models and tools for diverse relation types were developed, which can be widely used in biomedical text mining. Our usability studies provided a proof-of-concept demonstration of how large-scale RE can be leveraged to facilitate novel research.

Mike voted
Ishan voted

#1744 - Zhang 2023
Predicting Dynamic Relationship for Financial Knowledge Graph

Zhang, Z.; Ni, Z.; Liu, Z.; Xia, S.

Data. Anal. Knowl. Discov. 2023;7(9):39-50

2023

DOI: 10.11925/infotech.2096-3467.2022.0921 · Ref ID: 4885

[Objective] This paper proposes a data-driven prediction method for dynamic relationships, aiming to provide a new perspective for rapidly updating the financial knowledge graph. [Methods] First, we regularly crawled relevant information from the Internet according to the monitoring list. Then, we used the Mask Language Model to construct a dataset and train the model. Third, we extracted the hierarchical structure of the financial knowledge graph to build a hidden layer of the neural network. The neurons contained in the hidden layer represent named entities. Fourth, we connected the hidden layers by a relationship matrix and predicted the dynamic relationships by updating the connection matrix. [Results] We examined the proposed model with the two equity changes at the beginning of the "Baowan" event. Our new model quickly captured the changes in the relationship between corresponding entities of the financial graph in different periods. [Limitations] Due to the characteristics of unsupervised learning, the predicted relationship is relatively divergent, which requires manual calibration verification. [Conclusions] With sufficient data, the proposed method can effectively capture the changes in the relationship between entities without manual annotation. It will effectively and continuously predict the relationship of the financial knowledge graph. © 2023 The Author(s).

Mike voted
Srividya voted

#587 - Zhang 2021
Multi-Turn Dialogue Reading Comprehension With Pivot Turns and Knowledge

Zhang, Z. S.; Li, J. L.; Zhao, H.

IEEE-ACM Trans. Audio Speech Lang. 2021;29():1161-1173

2021

DOI: 10.1109/taslp.2021.3058616 · Ref ID: 3507

Multi-turn dialogue reading comprehension aims to teach machines to read dialogue contexts and solve tasks such as response selection and answering questions. The major challenges involve noisy history contexts and the special prerequisite of commonsense knowledge that is unseen in the given material. Existing works mainly focus on context and response matching approaches. This work thus makes the first attempt to tackle the above two challenges by extracting substantially important turns as pivot utterances and utilizing external knowledge to enhance the representation of context. We propose a pivot-oriented deep selection model (PoDS) on top of Transformer-based language models for dialogue comprehension. In detail, our model first picks out the pivot utterances from the conversation history according to the semantic matching with the candidate response or question, if any. Besides, knowledge items related to the dialogue context are extracted from a knowledge graph as external knowledge. Then, the pivot utterances and the external knowledge are combined together with a well-designed mechanism for refining predictions. Experimental results on four dialogue comprehension benchmark tasks show that our proposed model achieves great improvements over baselines. A series of empirical comparisons are conducted to show how our selection strategies and the extra knowledge injection influence the results.

Srividya voted
Xinchen voted

#237 - Zhang 2019
ERNIE: Enhanced Language Representation with Informative Entities

Zhang, Z. Y.; Han, X.; Liu, Z. Y.; Jiang, X.; Sun, M. S.; Liu, Q.; Acl

57th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2019;():1441-1451

Florence, ITALY Assoc Computational Linguistics-Acl 2019

Ref ID: 3419

Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The source code and experiment details of this paper can be obtained from https://github.com/thunlp/ERNIE.

Srividya voted
Xinchen voted

#666 - Zhang 2020
Pretrain-KGE: Learning Knowledge Representation from Pretrained Language Models

Zhang, Z. Y.; Liu, X. Q.; Zhang, Y.; Su, Q.; Sun, X.; He, B.

Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2020;():259-266

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 3097

Conventional knowledge graph embedding (KGE) often suffers from limited knowledge representation, leading to performance degradation especially on the low-resource problem. To remedy this, we propose to enrich knowledge representation via pretrained language models by leveraging world knowledge from pretrained models. Specifically, we present a universal training framework named Pretrain-KGE consisting of three phases: semantic-based fine-tuning phase, knowledge extracting phase and KGE training phase. Extensive experiments show that our proposed Pretrain-KGE can improve results over KGE models, especially on solving the low-resource problem.

Davis voted
Ishan voted

#3289 - Zhang 2024
Contrastive Learning for Knowledge-Based Question Generation in Large Language Models

Zhang, Zhenhong; Chen, Jiajing; Shi, Weiyan; Yi, Lingjie; Wang, Chihang; Yu, Qian

arXiv 2024;():

2024

Ref ID: 8617

With the rapid development of artificial intelligence technology, especially the increasingly widespread application of question-and-answer systems, high-quality question generation has become a key component in supporting the development of these systems. This article focuses on knowledge-based question generation technology, which aims to enable computers to simulate the human questioning process based on understanding specific texts or knowledge bases. In light of the issues of hallucination and knowledge gaps present in large-scale language models when applied to knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. This method utilizes multiple models to jointly mine domain knowledge and uses contrastive learning to guide the model in reducing noise and hallucinations in generation. Experimental results show that by designing prompts containing contrasting examples, the model's performance in question generation improves considerably, particularly when contrasting instructions and examples are used simultaneously, leading to the highest quality of generated questions and improved accuracy. These results demonstrate that the method proposed in this study, which combines contrasting context and chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation.

mohammed afaan voted
Ishan voted

#3134 - Zhang 2024
A GAIL Fine-Tuned LLM Enhanced Framework for Low-Resource Knowledge Graph Question Answering

Zhang, Zhiqiang; Wen, Liqiang; Zhao, Wen

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3300–3309

Boise, ID, USA Association for Computing Machinery 2024

DOI: 10.1145/3627673.3679753 · Ref ID: 7116

mohammed afaan voted
yuexi voted

#3695 - Zhang 2023
Make Them Spill the Beans! Coercive Knowledge Extraction from (Production) LLMs

Zhang, Zhuo; Shen, Guangyu; Tao, Guanhong; Cheng, Siyuan; Zhang, Xiangyu

arXiv 2023;():

2023

Ref ID: 7976

Large Language Models (LLMs) are now widely used in various applications, making it crucial to align their ethical standards with human values. However, recent jail-breaking methods demonstrate that this alignment can be undermined using carefully constructed prompts. In our study, we reveal a new threat to LLM alignment when a bad actor has access to the model's output logits, a common feature in both open-source LLMs and many commercial LLM APIs (e.g., certain GPT models). It does not rely on crafting specific prompts. Instead, it exploits the fact that even when an LLM rejects a toxic request, a harmful response often hides deep in the output logits. By forcefully selecting lower-ranked output tokens during the auto-regressive generation process at a few critical output positions, we can compel the model to reveal these hidden responses. We term this process model interrogation. This approach differs from and outperforms jail-breaking methods, achieving 92% effectiveness compared to 62%, and is 10 to 20 times faster. The harmful content uncovered through our method is more relevant, complete, and clear. Additionally, it can complement jail-breaking strategies, further boosting attack performance. Our findings indicate that interrogation can extract toxic knowledge even from models specifically designed for coding tasks.

Davis voted
yuexi voted
Final decision
What was the agreed final decision?

#1072 - Zhao 2021
A Chinese Machine Reading Comprehension Dataset Automatic Generated Based on Knowledge Graph

Zhao, H.; Yuan, S.; Leng, J.; Pan, X.; Xue, Z.; Ma, Q.; Liang, Y.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2021;12869 LNAI():268-279

Springer Science and Business Media Deutschland GmbH 2021

DOI: 10.1007/978-3-030-84186-7_18 · Ref ID: 5607

Machine reading comprehension (MRC) is a typical natural language processing (NLP) task and has developed rapidly in the last few years. Various reading comprehension datasets have been built to support MRC studies. However, large-scale, high-quality datasets are rare due to the high complexity and labor cost of building them. Besides, most reading comprehension datasets are in English, and Chinese datasets are insufficient. In this paper, we propose an automatic method for MRC dataset generation and build the largest Chinese medical reading comprehension dataset to date, named CMedRC. Our dataset contains 17k questions generated by our automatic method and some seed questions. We obtain the corresponding answers from a medical knowledge graph and manually check all of them. Finally, we test BiLSTM and BERT-based pre-trained language models (PLMs) on our dataset and propose a baseline for subsequent studies. Results show that the automatic MRC dataset generation method holds considerable promise for future model improvements. © 2021, Springer Nature Switzerland AG.

Davis voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#3243 - Zhao 2021
Calculating Question Similarity is Enough: A New Method for KBQA Tasks

Zhao, Hanyu; Yuan, Sha; Leng, Jiahong; Pan, Xiang; Wang, Guoqiang; Wu, Ledell; Tang, Jie

arXiv 2021;():

2021

Ref ID: 7496

Knowledge Base Question Answering (KBQA) aims to answer natural language questions with the help of an external knowledge base. The core idea is to find the link between the internal knowledge behind questions and known triples of the knowledge base. Traditional KBQA pipelines contain several steps, including entity recognition, entity linking, answer selection, etc. In such pipeline methods, errors in any step inevitably propagate to the final prediction. To address this challenge, this paper proposes a Corpus Generation - Retrieve Method (CGRM) with a Pre-trained Language Model (PLM) for the KBQA task. The major novelty lies in the design of the new method: the knowledge-enhanced T5 (kT5) model generates natural language QA pairs based on knowledge graph triples, and questions are answered directly by retrieving from the synthetic dataset. The new method can extract more information about the entities from the PLM to improve accuracy and simplify the process. We test our method on the NLPCC-ICCPOL 2016 KBQA dataset, and the results show that our method improves KBQA performance and that our straightforward method is competitive with the state of the art.
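The retrieval step of a CGRM-style pipeline reduces to question similarity: find the most similar question in the synthetic QA corpus and return its answer. A minimal sketch using bag-of-words cosine similarity (the paper's corpus comes from a kT5 model and uses a stronger similarity function; the QA pairs below are hypothetical):

```python
# Sketch of answering-by-retrieval over a synthetic QA corpus: the
# incoming question is matched to the nearest corpus question via
# bag-of-words cosine similarity, and that question's answer is returned.
import math
from collections import Counter

def cosine(a, b):
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(wa[t] * wb[t] for t in wa)
    na = math.sqrt(sum(v * v for v in wa.values()))
    nb = math.sqrt(sum(v * v for v in wb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_answer(question, qa_corpus):
    best_q = max(qa_corpus, key=lambda q: cosine(question, q))
    return qa_corpus[best_q]

corpus = {  # hypothetical QA pairs generated from KG triples
    "where was Marie Curie born": "Warsaw",
    "what did Marie Curie discover": "radium",
}
print(retrieve_answer("where was curie born", corpus))  # → "Warsaw"
```

Because the corpus is generated from known triples, retrieval replaces the error-prone entity recognition and linking steps, which is the pipeline simplification the abstract claims.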

Mike voted
Kwesi voted
Final decision
What was the agreed final decision?

#1737 - Zhao 2024
Power Large Language Model Exploration: Activation, Measurement and Enhancement for Power O&M Knowledge

Zhao, J.; Ma, Z.; Zhao, H.; Zhang, X.; Liu, Q.; Peng, X.; Zhang, G.

ACM International Conference Proceeding Series 2024;():1-7

Association for Computing Machinery 2024

DOI: 10.1145/3689218.3689222 · Ref ID: 3847

With the rapid advancement of Large Language Models, their applications are gradually transitioning from general to specific domains. However, the application of LLMs in the electric power domain is still in its early stages, and few studies have explored power LLMs. Currently, there are two main challenges facing power LLMs: (1) determining how to measure the real power knowledge capacity of LLMs to facilitate targeted enhancement of specific knowledge, and (2) identifying practical enhancement methods to enable efficient and feasible power LLM applications in real-world scenarios. In this paper, we ask three insightful questions that address the power knowledge capacity of LLMs and then draw inspiration from Reflexion and CoT to design an Activation, Measurement and Enhancement (AME) framework for power operations and maintenance (O&M) knowledge. Specifically, we ask three "HOW" questions based on the activation, measurement, and enhancement of power O&M knowledge. In our proposed AME, we introduce a Reflexion Module to discover the knowledge capacity of the LLM and a Knowledge Graph Module to provide it with external knowledge. Experiments on a real-world dataset provide strong evidence as we answer the above three questions. © 2024 ACM.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#746 - Zhao 2024
Self-consistency, Extract and Rectify: Knowledge Graph Enhance Large Language Model for Electric Power Question Answering

Zhao, J. X.; Ma, Z. C.; Zhao, H.; Zhang, X.; Liu, Q. C.; Zhang, C. T.

20th International Conference on Intelligent Computing (ICIC) 2024;14873():493-504

Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024

DOI: 10.1007/978-981-97-5615-5_40 · Ref ID: 3172

Electric power artificial intelligence has rapidly advanced in recent years, encompassing safety detection, assistant decision-making, and optimal scheduling. With the rise of Large Language Models (LLMs), knowledge-based AI is becoming increasingly prevalent across various domains. However, in the field of electric power, most knowledge-based AI is centered on Knowledge Graph (KG) techniques, and less research has been done on power LLMs. In this paper, inspired by Self-Consistency (SC), we propose a Self-Consistency, Extract and Rectify framework (SCER) for using KG-enhanced LLMs in power operations and maintenance (O&M) question answering scenarios. Specifically, we transfer SC from the general-purpose domain into the power domain and replace the original model with a Chinese sentence representation model to better localize it. We design an Extract Mechanism to generate evidence chains through multiple random walks on the POMKG and a Rectify Mechanism to correct the scores of the generated rationales. Extensive experiments and specific case studies on the POMQA dataset demonstrate the effectiveness of our proposed SCER for SC transfer and its improvement in the power field.

Mike voted
Davis voted
Final decision
What was the agreed final decision?

#3246 - Zhao 2024
Can Language Model Understand Word Semantics as A Chatbot? An Empirical Study of Language Model Internal External Mismatch

Zhao, Jinman; Zhang, Xueyan; Yue, Xingyu; Chen, Weizhe; Qian, Zifan; Wang, Ruiyu

arXiv 2024;():

2024

Ref ID: 8616

Common interactions with language models today go through full inference, an approach that may not align with the model's internal knowledge: studies show discrepancies between prompts and internal representations, but most focus on sentence understanding. We study the mismatch between internal and external understanding of word semantics across Encoder-only, Decoder-only, and Encoder-Decoder pre-trained language models.

yuexi voted
Srividya voted
Final decision
What was the agreed final decision?

#2268 - Zhao 2014
A concept-based knowledge representation model for semantic entailment inference

Zhao, M.; Ni, W.; Zhang, H.; Yang, Y.

Proceedings of the 33rd Chinese Control Conference 2014;():522-527

2014

DOI: 10.1109/ChiCC.2014.6896678 · Ref ID: 6050

Semantic entailment is a fundamental problem in the natural language understanding field with a large number of applications. Knowledge acquisition and knowledge representation are crucial parts of semantic inference strategies. This paper presents a principled approach to the semantic entailment problem that builds on a concept-based knowledge representation model (CKR). This model formally defines a concept as a triple (attribute, relation, behavior), by which the knowledge of a concept can be illustrated. We propose a semantic inference strategy aimed at identifying text segments that have dissimilar surface forms but share a common meaning. The inference strategy avoids syntactic analysis steps. A preliminary evaluation on the PASCAL text collection is presented. Experimental results show that our concept-based inference strategy is effective and has strong development potential.

Mike voted
Ishan voted
Final decision
What was the agreed final decision?

#3113 - Zhao 2024
Breaking the Barrier: Utilizing Large Language Models for Industrial Recommendation Systems through an Inferential Knowledge Graph

Zhao, Qian; Qian, Hao; Liu, Ziqi; Zhang, Gong-Duo; Gu, Lihong

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():5086–5093

Boise, ID, USA Association for Computing Machinery 2024

DOI: 10.1145/3627673.3680022 · Ref ID: 7138

Srividya voted
brandon voted
Final decision
What was the agreed final decision?

#320 - Zhao 2024
Graph Reasoning Transformers for Knowledge-Aware Question Answering

Zhao, R. L.; Zhao, F.; Hu, L.; Xu, G. D.

38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():19652-19660

Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024

Ref ID: 3470

Augmenting Language Models (LMs) with structured knowledge graphs (KGs) aims to leverage structured world knowledge to enhance the capability of LMs to complete knowledge-intensive tasks. However, existing methods are unable to effectively utilize the structured knowledge in a KG due to their inability to capture the rich relational semantics of knowledge triplets. Moreover, the modality gap between natural language text and KGs has become a challenging obstacle when aligning and fusing cross-modal information. To address these challenges, we propose a novel knowledge-augmented question answering (QA) model, namely, Graph Reasoning Transformers (GRT). Unlike conventional node-level methods, the GRT treats knowledge triplets as atomic knowledge and utilizes a triplet-level graph encoder to capture triplet-level graph features. Furthermore, to alleviate the negative effect of the modality gap on joint reasoning, we propose representation alignment pretraining to align the cross-modal representations and introduce a cross-modal information fusion module with attention bias to enable cross-modal information fusion. Extensive experiments conducted on three knowledge-intensive QA benchmarks show that the GRT outperforms state-of-the-art KG-augmented QA systems, demonstrating the effectiveness and adaptability of our proposed model.

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#3170 - Zhao 2024
Zero-shot Knowledge Graph Question Generation via Multi-agent LLMs and Small Models Synthesis

Zhao, Runhao; Tang, Jiuyang; Zeng, Weixin; Chen, Ziyang; Zhao, Xiang

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3341–3351

Boise, ID, USA Association for Computing Machinery 2024

DOI: 10.1145/3627673.3679805 · Ref ID: 7114

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3180 - Zhao 2024
AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data

Zhao, Xinjie; Blum, Moritz; Yang, Rui; Yang, Boming; Carpintero, Luis Márquez; Pina-Navarro, Mónica; Wang, Tony; Li, Xin; Li, Huitao; Fu, Yanran; Wang, Rongrong; Zhang, Juntao; Li, Irene

arXiv 2024;():

2024

Ref ID: 8710

Large Language Models (LLMs) have demonstrated capabilities across various applications but face challenges such as hallucination, limited reasoning abilities, and factual inconsistencies, especially when tackling complex, domain-specific tasks like question answering (QA). While Knowledge Graphs (KGs) have been shown to help mitigate these issues, research on the integration of LLMs with background KGs remains limited. In particular, user accessibility and the flexibility of the underlying KG have not been thoroughly explored. We introduce AGENTiGraph (Adaptive Generative ENgine for Task-based Interaction and Graphical Representation), a platform for knowledge management through natural language interaction. It integrates knowledge extraction, integration, and real-time visualization. AGENTiGraph employs a multi-agent architecture to dynamically interpret user intents, manage tasks, and integrate new knowledge, ensuring adaptability to evolving user requirements and data contexts. Our approach demonstrates superior performance in knowledge graph interactions, particularly for complex domain-specific tasks. Experimental results on a dataset of 3,500 test cases show AGENTiGraph significantly outperforms state-of-the-art zero-shot baselines, achieving 95.12% accuracy in task classification and 90.45% success rate in task execution. User studies corroborate its effectiveness in real-world scenarios. To showcase versatility, we extended AGENTiGraph to legislation and healthcare domains, constructing specialized KGs capable of answering complex queries in legal and medical contexts.

Xinchen voted
mohammed afaan voted
Final decision
What was the agreed final decision?

#1510 - Zheng 2024
A KNOWLEDGE GRAPH MODELING APPROACH FOR AUGMENTING LANGUAGE MODEL-BASED CONTRACT RISK IDENTIFICATION

Zheng, C.; Tang, Y.; Su, X.

Proceedings of the European Conference on Computing in Construction 2024;2024():260-267

European Council on Computing in Construction (EC3) 2024

DOI: 10.35490/EC3.2024.178 · Ref ID: 4393

Contract risk identification is essential for preventing disputes and losses in the construction industry. Large language models (LLMs) have impacted various natural language processing tasks, offering a promising avenue for automating contract review without extensive data processing and feature engineering. However, LLMs still have difficulty recalling facts while generating knowledge-grounded analysis, especially when complex domain knowledge is involved. This paper introduces a Knowledge Graph (KG) modeling approach to enhance LLM-based automated contract risk identification. A case study demonstrates that our approach exhibits enhanced performance on risk identification tasks compared to the non-augmented scenario. © 2024 European Council on Computing in Construction.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3572 - Zheng 2023
KGLens: Towards Efficient and Effective Knowledge Probing of Large Language Models with Knowledge Graphs

Zheng, Shangshang; Bai, He; Zhang, Yizhe; Su, Yi; Niu, Xiaochuan; Jaitly, Navdeep

arXiv 2023;():

2023

Ref ID: 7997

Large Language Models (LLMs) might hallucinate facts, while curated Knowledge Graphs (KGs) are typically factually reliable, especially for domain-specific knowledge. Measuring the alignment between KGs and LLMs can effectively probe the factuality of LLMs and identify their knowledge blind spots. However, verifying LLMs over extensive KGs can be expensive. In this paper, we present KGLens, a Thompson-sampling-inspired framework aimed at effectively and efficiently measuring the alignment between KGs and LLMs. KGLens features a graph-guided question generator for converting KGs into natural language, along with a carefully designed importance sampling strategy based on a parameterized KG structure to expedite KG traversal. Our simulation experiment compares the brute-force method with KGLens under six different sampling methods, demonstrating that our approach achieves superior probing efficiency. Leveraging KGLens, we conducted in-depth analyses of the factual accuracy of ten LLMs across three large domain-specific KGs from Wikidata, comprising over 19K edges, 700 relations, and 21K entities. Human evaluation results indicate that KGLens can assess LLMs with a level of accuracy nearly equivalent to that of human annotators, achieving a 95.7% accuracy rate.
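The Thompson-sampling idea can be sketched in miniature: keep a Beta posterior over each relation's error rate, sample from the posteriors, and probe the relation whose sampled error rate is highest, so probing budget concentrates on likely blind spots. This is a hypothetical simplification; KGLens samples over a parameterized KG structure rather than flat relation counts, and the relations and error rates below are invented.

```python
# Thompson-sampling sketch for KG-vs-LLM probing: Beta(errors+1,
# successes+1) posteriors per relation; probe the relation with the
# highest sampled error rate, then update its counts.
import random

def pick_relation(posteriors, rng):
    """posteriors: {relation: [errors, successes]}."""
    sampled = {r: rng.betavariate(a + 1, b + 1)
               for r, (a, b) in posteriors.items()}
    return max(sampled, key=sampled.get)

def update(posteriors, relation, llm_was_wrong):
    a, b = posteriors[relation]
    posteriors[relation] = [a + 1, b] if llm_was_wrong else [a, b + 1]

rng = random.Random(0)
post = {"birthplace": [0, 0], "capital_of": [0, 0]}
for _ in range(100):
    r = pick_relation(post, rng)
    # Simulated oracle: this toy LLM errs often on "birthplace" facts.
    update(post, r, llm_was_wrong=(r == "birthplace" and rng.random() < 0.8))
print(post)
```

Over the 100 probes, sampling steers the budget toward whichever relation accumulates the higher posterior error rate, which is the efficiency gain over brute-force traversal of every edge.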

yuexi voted
Davis voted
Final decision
What was the agreed final decision?

#3262 - Zheng 2024
CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

Zheng, Tianshi; Bai, Jiaxin; Wang, Yicheng; Fang, Tianqing; Guo, Yue; Yim, Yauwai; Song, Yangqiu

arXiv 2024;():

2024

Ref ID: 8495

While large language models (LLMs) have demonstrated impressive capabilities across various natural language processing tasks by acquiring rich factual knowledge from their broad training data, their ability to synthesize and logically reason with this knowledge in complex ways remains underexplored. In this work, we present a systematic evaluation of state-of-the-art LLMs' complex logical reasoning abilities through a novel benchmark of automatically generated complex reasoning questions over general domain and biomedical knowledge graphs. Our extensive experiments, employing diverse in-context learning techniques, reveal that LLMs excel at reasoning over general world knowledge but face significant challenges with specialized domain-specific knowledge. We find that prompting with explicit Chain-of-Thought demonstrations can substantially improve LLM performance on complex logical reasoning tasks with diverse logical operations. Interestingly, our controlled evaluations uncover an asymmetry where LLMs display proficiency at set union operations, but struggle considerably with set intersections - a key building block of logical reasoning. To foster further work, we will publicly release our evaluation benchmark and code.
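The union/intersection asymmetry reported above is easy to state in exact set semantics over a toy KG (the diseases and drugs below are hypothetical examples, not benchmark data): the two query types differ only in the set operator, yet the paper finds LLMs handle unions far better than intersections.

```python
# Union vs. intersection queries over a toy biomedical KG:
# "drugs treating A or B" vs. "drugs treating both A and B".
treats = {  # hypothetical disease -> drug-set mapping
    "hypertension": {"lisinopril", "amlodipine", "metoprolol"},
    "angina": {"amlodipine", "metoprolol", "nitroglycerin"},
}

union_answer = treats["hypertension"] | treats["angina"]
intersection_answer = treats["hypertension"] & treats["angina"]
print(sorted(union_answer))
print(sorted(intersection_answer))  # → ['amlodipine', 'metoprolol']
```

Benchmarks like this one compare an LLM's free-text answer set against these exact answers, so an intersection error (e.g., listing a drug that treats only one disease) is directly measurable.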

Srividya voted
Ishan voted
Final decision
What was the agreed final decision?

#3613 - Zheng 2024
A Knowledge-Enhanced Disease Diagnosis Method Based on Prompt Learning and BERT Integration

Zheng, Zhang

arXiv 2024;():

2024

Ref ID: 8603

This paper proposes a knowledge-enhanced disease diagnosis method based on a prompt learning framework. The method retrieves structured knowledge from external knowledge graphs related to clinical cases, encodes it, and injects it into the prompt templates to enhance the language model's understanding and reasoning capabilities for the task. We conducted experiments on three public datasets: CHIP-CTC, IMCS-V2-NER, and KUAKE-QTR. The results show that the proposed method significantly outperforms existing models across multiple evaluation metrics, with an F1 score improvement of 2.4% on the CHIP-CTC dataset, 3.1% on the IMCS-V2-NER dataset, and 4.2% on the KUAKE-QTR dataset. Additionally, ablation studies confirmed the critical role of the knowledge injection module, as the removal of this module resulted in a significant drop in F1 score. The experimental results demonstrate that the proposed method not only effectively improves the accuracy of disease diagnosis but also enhances the interpretability of the predictions, providing more reliable support and evidence for clinical diagnosis.
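The injection step described above can be sketched as template filling: retrieve KG triples whose subject appears in the clinical text and splice them into the prompt handed to the language model. The triples, template, and matching rule are hypothetical; the paper encodes the retrieved knowledge rather than pasting raw triples.

```python
# Sketch of prompt-template knowledge injection: KG facts relevant to
# the clinical case are formatted and inserted into the prompt.
KG = [  # hypothetical (subject, relation, object) triples
    ("chest pain", "possible_cause", "angina"),
    ("chest pain", "possible_cause", "GERD"),
    ("fever", "possible_cause", "infection"),
]

TEMPLATE = "Case: {case}\nRelevant knowledge: {facts}\nDiagnosis:"

def build_prompt(case_text):
    facts = "; ".join(
        f"{s} -[{r}]-> {o}" for s, r, o in KG if s in case_text.lower()
    )
    return TEMPLATE.format(case=case_text, facts=facts or "none")

prompt = build_prompt("Patient reports chest pain after meals.")
print(prompt)
```

Only triples mentioning entities present in the case text reach the prompt, so the model's reasoning is grounded in case-relevant facts rather than the whole graph.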

Kwesi voted
Xinchen voted
Final decision
What was the agreed final decision?

#887 - Zhengbao 2020
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

Zhengbao, J.; Anastasopoulos, A.; Jun, A.; Haibo, D.; Neubig, G.; Assoc Computat, Linguist

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():5943-5959

Electr Network Assoc Computational Linguistics-Acl 2020

Ref ID: 3649

Language models (LMs) have proven surprisingly successful at capturing factual knowledge by completing cloze-style fill-in-the-blank questions such as "Punta Cana is located in _." However, while knowledge is both written and queried in many languages, studies on LMs' factual representation ability have almost invariably been performed on English. To assess factual knowledge retrieval in LMs in different languages, we create a multilingual benchmark of cloze-style probes for 23 typologically diverse languages. To properly handle language variations, we expand probing methods from single- to multi-word entities, and develop several decoding algorithms to generate multi-token predictions. Extensive experimental results provide insights about how well (or poorly) current state-of-the-art LMs perform at this task in languages with more or fewer available resources. We further propose a code-switching-based method to improve the ability of multilingual LMs to access knowledge, and verify its effectiveness on several benchmark languages. Benchmark data and code have been released at https://x-factr.github.io.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1840 - Zhong 2023
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering

Zhong, V.; Shi, W.; Yih, S. W. T.; Zettlemoyer, L.

Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():7055-7067

Association for Computational Linguistics (ACL) 2023

Ref ID: 5049

We introduce RoMQA, the first benchmark for robust, multi-evidence, multi-answer question answering (QA). RoMQA contains clusters of questions that are derived from related constraints mined from the Wikidata knowledge graph. RoMQA evaluates robustness of QA models to varying constraints by measuring worst-case performance within each question cluster. Compared to prior QA datasets, RoMQA has more human-written questions that require reasoning over more evidence text and have, on average, many more correct answers. In addition, human annotators rate RoMQA questions as more natural or likely to be asked by people. We evaluate state-of-the-art large language models in zero-shot, few-shot, and fine-tuning settings, and find that RoMQA is challenging: zero-shot and few-shot models perform similarly to naive baselines, while supervised retrieval methods perform well below gold evidence upper bounds. Moreover, existing models are not robust to variations in question constraints, but can be made more robust by tuning on clusters of related questions. Our results show that RoMQA is a challenging benchmark for large language models, and provides a quantifiable test to build more robust QA methods. © 2023 Association for Computational Linguistics.
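RoMQA's robustness measure can be written down directly: group per-question scores by their constraint cluster and report the minimum within each cluster, so a model gets no credit for a cluster unless it handles every variation. The cluster names and scores below are hypothetical.

```python
# Worst-case-within-cluster metric in the spirit of RoMQA: per-cluster
# robustness is the minimum per-question accuracy in that cluster.
def worst_case_by_cluster(scores):
    """scores: {cluster_id: [per-question accuracy]} -> {cluster_id: min}."""
    return {c: min(vals) for c, vals in scores.items()}

clusters = {  # hypothetical question clusters with per-question accuracy
    "musicians_from_chicago": [0.9, 0.4, 0.8],
    "rivers_in_spain": [0.7, 0.7],
}
print(worst_case_by_cluster(clusters))
# → {'musicians_from_chicago': 0.4, 'rivers_in_spain': 0.7}
```

Averaging would hide the weak question in the first cluster (mean 0.7); the worst-case view exposes exactly the non-robustness the benchmark targets.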

Ishan voted
Srividya voted
Final decision
What was the agreed final decision?

#759 - Zhong 2022
Semantics Driven Embedding Learning for Effective Entity Alignment

Zhong, Z. Y.; Zhang, M. H.; Fan, J.; Dou, C. X.; Soc, Ieee Comp

38th IEEE International Conference on Data Engineering (ICDE) 2022;():2127-2140

Electr Network Ieee Computer Soc 2022

DOI: 10.1109/icde53745.2022.00205 · Ref ID: 3568

Knowledge-based data services have become an emerging form of service on the World Wide Web (WWW). To ensure service quality, a comprehensive knowledge base has to be constructed, and knowledge base integration is often a primary way to improve completeness. In this paper, we focus on the fundamental problem in knowledge base integration, i.e., entity alignment (EA). EA has been studied for years. Traditional approaches focus on the symbolic features of entities and propose various similarity measures to identify equivalent entities. With recent developments in knowledge graph representation learning, embedding-based entity alignment has emerged, which encodes entities into vectors according to semantic or structural information and computes the relatedness of entities based on the vector representation. While embedding-based approaches achieve promising results, we identify some important information that is not well exploited in existing works: 1) neighboring entities contribute differently in the EA process and should be carefully weighted when learning the relatedness of entities; 2) attribute values (especially long texts) contain rich semantics that can build supplementary associations between entities. To this end, we propose SDEA - a Semantics Driven entity embedding method for Entity Alignment. SDEA consists of two modules, namely attribute embedding and relation embedding. The attribute embedding captures semantic information from attribute values with a pre-trained transformer-based language model. The relation embedding selectively aggregates semantic information from neighbors using a GRU model equipped with an attention mechanism. Both attribute embedding and relation embedding are driven by semantics, building bridges between entities. Experimental results show that our method significantly outperforms state-of-the-art approaches on three benchmarks.

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#525 - Zhou 2022
Leveraging on causal knowledge for enhancing the root cause analysis of equipment spot inspection failures

Zhou, B.; Li, J.; Li, X. Y.; Hua, B.; Bao, J. S.

Adv. Eng. Inform. 2022;54():10

2022

DOI: 10.1016/j.aei.2022.101799 · Ref ID: 3380

Causal correlation data in equipment spot-inspection operation and maintenance (O&M) records and fault investigation sheets potentially reflect the causal effects behind equipment failures. Various factors influence equipment failures, making it difficult to effectively analyze the main cause of a problem. Mining and leveraging these causal data from equipment spot-inspection records can significantly improve root cause analysis of faults in the O&M system. Hence, this paper introduces causal knowledge into equipment fault O&M for the first time and proposes to exploit causal knowledge to enhance root cause analysis of equipment spot-inspection failures. Specifically, an equipment fault O&M knowledge graph with causal knowledge, called CausalKG, is constructed to provide knowledge support for the causal analysis of faults. CausalKG consists of a spot-inspection knowledge graph (SIKG) and causal relationship knowledge (CRK) in equipment fault O&M. Further, a CausalKG-ALBERT knowledge reasoning model is designed. The model transforms CausalKG into network embeddings based on relational graph convolutional networks and combines the Q&A mechanism of the language model ALBERT to mine root cause knowledge of equipment failures. The case study confirms that incorporating CRK is more effective than directly using the SIKG for causality reasoning, and that the model can make full use of causal relationship knowledge to enhance the reliability of root cause analysis. This method is valuable for helping engineers strengthen their causal analysis capabilities in preventive equipment maintenance.

mohammed afaan voted
Ishan voted
Final decision
What was the agreed final decision?

#92 - Zhou 2024
CausalKGPT: Industrial structure causal knowledge-enhanced large language model for cause analysis of quality problems in aerospace product manufacturing

Zhou, B.; Li, X. Y.; Liu, T. Y.; Xu, K. Z.; Liu, W.; Bao, J. S.

Adv. Eng. Inform. 2024;59():16

2024

DOI: 10.1016/j.aei.2023.102333 · Ref ID: 3302

The whole cycle for manufacturing aerospace thin-walled shells is a lengthy and sophisticated process. A large amount of quality-related data exists within and between processes, involving many types of quality defects and influencing factors. However, there are ambiguous causal associations among the quality-related data affecting the shape and properties of the shell. Also, the coupling of long processes and multiple factors makes it hard to analyze the main factors behind quality defects in shell manufacturing. In this paper, taking into account the advantages of causal science and large language models (LLMs), we propose an industrial structural-causal-knowledge-enhanced large language model for the cause analysis of quality defects in aerospace product manufacturing. To reinforce the causal associations among quality-related data derived from manufacturing documents (product defect survey sheets, quality inspection, and maintenance reports), a structural causal graph-based sum-product network (SCG-SPN) model is designed to model machining quality-related knowledge and eliminate pseudo-association confounding factors via intervention. Thus, a causal quality-related knowledge graph (CQKG) with high-quality causal associations is constructed. To provide trustworthy support for quality problem solving, we construct a quality-related prompt dataset with multi-round conversations based on the CQKG. Then, a novel P-tuning that utilizes external CQKG instructions is designed to fine-tune an open-source ChatGLM base model. Based on this, a causal knowledge graph-augmented LLM, named CausalKGPT, is developed to enable reasoning about and responding to quality defects in both Chinese and English. It uses natural text descriptions of quality defects as input and takes the quality-related causal knowledge graph as an additional corpus.
Finally, the case study shows that CausalKGPT responds to quality problems in aerospace shell manufacturing with more expertise and reliability than classic commercial models like ChatGPT and GPT-4. The results indicate that the proposed method may provide a trustworthy guide for assisting workers in analyzing quality defects in aerospace products.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#1774 - Zhou 2023
PROTEIN REPRESENTATION LEARNING VIA KNOWLEDGE ENHANCED PRIMARY STRUCTURE MODELING

Zhou, H. Y.; Fu, Y.; Zhang, Z.; Bian, C.; Yu, Y.

11th International Conference on Learning Representations, ICLR 2023 2023;():

International Conference on Learning Representations, ICLR 2023

Ref ID: 4991

Protein representation learning has primarily benefited from the remarkable development of language models (LMs). Accordingly, pre-trained protein models also suffer from a problem in LMs: a lack of factual knowledge. The recent solution models the relationships between protein and associated knowledge terms as the knowledge encoding objective. However, it fails to explore the relationships at a more granular level, i.e., the token level. To mitigate this, we propose Knowledge-exploited Auto-encoder for Protein (KeAP), which performs token-level knowledge graph exploration for protein representation learning. In practice, non-masked amino acids iteratively query the associated knowledge tokens to extract and integrate helpful information for restoring masked amino acids via attention. We show that KeAP can consistently outperform the previous counterpart on 9 representative downstream applications, sometimes surpassing it by large margins. These results suggest that KeAP provides an alternative yet effective way to perform knowledge enhanced protein representation learning. Code and models are available at https://github.com/RL4M/KeAP. © 2023 11th International Conference on Learning Representations, ICLR 2023. All rights reserved.

Mike voted
Srividya voted
Final decision
What was the agreed final decision?

#3107 - Zhou 2024
Automated Medical Report Generation and Visual Question Answering

Zhou, Luping

Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine 2024;():3–4

Melbourne VIC, Australia Association for Computing Machinery 2024

DOI: 10.1145/3688868.3689189 · Ref ID: 7309

mohammed afaan voted
yuexi voted
Final decision
What was the agreed final decision?

#3398 - Zhou 2024
Establishing Knowledge Preference in Language Models

Zhou, Sizhe; Li, Sha; Meng, Yu; Jiao, Yizhu; Ji, Heng; Han, Jiawei

arXiv 2024;():

2024

Ref ID: 8470

Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to provide recommendations, the model should prioritize user specifications over retrieved product reviews; when some facts are edited in the model, the updated facts should override all prior knowledge learned by the model even if they are conflicting. In all of the cases above, the model faces a decision between its own parametric knowledge, (retrieved) contextual knowledge, and user instruction knowledge. In this paper, we (1) unify such settings into the problem of knowledge preference and define a three-level preference hierarchy over these knowledge sources; (2) compile a collection of existing datasets IfQA, MQuAKE, and MRQA covering a combination of settings (with/without user specifications, with/without context documents) to systematically evaluate how well models obey the intended knowledge preference; and (3) propose a dataset synthesis method that composes diverse question-answer pairs with user assumptions and related context to directly fine-tune LMs for instilling the hierarchy of knowledge. We demonstrate that a 7B model, fine-tuned on only a few thousand examples automatically generated by our proposed method, effectively achieves superior performance (more than 18% improvement across all evaluation benchmarks) in adhering to the desired knowledge preference hierarchy.
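The three-level hierarchy defined above reduces to a simple resolution rule: a user-provided specification overrides retrieved context, which in turn overrides the model's parametric knowledge. A minimal sketch (the answers below are hypothetical; in the paper the hierarchy is instilled by fine-tuning rather than hard-coded):

```python
# Knowledge-preference resolver: return the answer from the
# highest-priority source that is available.
def resolve(user_spec, context, parametric):
    for source in (user_spec, context, parametric):
        if source is not None:
            return source
    return None

# An edited fact in retrieved context overrides stale parametric knowledge:
print(resolve(None, "The CEO is Alice.", "The CEO is Bob."))
# An explicit user specification overrides both lower levels:
print(resolve("Only recommend items under $50.",
              "Review: great $80 item.", None))
```

The benchmark settings in the abstract (with/without specifications, with/without context documents) correspond to toggling these arguments to `None` and checking whether the model's answer matches what this rule would pick.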

Mike voted
Srividya voted

#155 - Zhou 2024
D-Bot: Database Diagnosis System using Large Language Models

Zhou, X. H.; Li, G. L.; Sun, Z. Y.; Liu, Z. Y.; Chen, W. Z.; Wu, J. M.; Liu, J. S.; Feng, R. H.; Zeng, G. Y.

Proc. VLDB Endow. 2024;17(10):2514-2527

2024

DOI: 10.14778/3675034.3675043 · Ref ID: 3651

Database administrators (DBAs) play an important role in managing database systems. However, it is hard and tedious for DBAs to manage vast database instances and give timely responses (waiting for hours is intolerable in many online cases). In addition, existing empirical methods only support limited diagnosis scenarios, and updating their diagnosis rules for new database versions is labor-intensive. Recently, large language models (LLMs) have shown great potential in various fields. Thus, we propose D-Bot, an LLM-based database diagnosis system that can automatically acquire knowledge from diagnosis documents and generate a reasonable and well-founded diagnosis report (i.e., identifying the root causes and solutions) within acceptable time (e.g., under 10 minutes compared to hours by a DBA). The techniques in D-Bot include (i) offline knowledge extraction from documents, (ii) automatic prompt generation (e.g., knowledge matching, tool retrieval), (iii) root cause analysis using a tree search algorithm, and (iv) a collaborative mechanism for complex anomalies with multiple root causes. We verify D-Bot on real benchmarks (including 539 anomalies of six typical applications), and the results show D-Bot can effectively identify root causes of unseen anomalies and significantly outperforms traditional methods and vanilla models like GPT-4.

mohammed afaan voted
yuexi voted

#1935 - Zhou 2024
Temporal Closing Path for PLM-based Temporal Knowledge Graph Completion

Zhou, X.; Shan, Y.; Dong, Z.; Liu, H.; Wang, X.

Proceedings of the International Joint Conference on Neural Networks 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

DOI: 10.1109/IJCNN60899.2024.10650003 · Ref ID: 4238

Temporal Knowledge Graph Completion (TKGC) aims to predict missing parts of quadruples, which is crucial for real-life knowledge graphs. Compared with methods that only use graph neural networks, the emergence of pre-trained models has introduced a trend of simultaneously leveraging text and graph structure information. However, most current methods based on pre-trained models struggle to effectively utilize both text and multi-hop graph structure information concurrently, resulting in insufficient association mining of relations. To address the challenge, we propose a novel model: Temporal Closing Path for Pre-trained Language Model-based TKGC (TCP-PLM). We obtain the temporal closing relation path of the target relation through sampling, and use the relation path as a bridge to simultaneously utilize text and multi-hop graph structure information. Moreover, the relation path serves as a tool for mining associations between relations. At the same time, due to the design of entity-independent relation paths, our model can also handle the inductive setting. Our experiments on three benchmarks, along with extensive analysis, demonstrate that our model not only achieves substantial performance enhancements across four metrics compared to other models but also adeptly handles inductive settings. © 2024 IEEE.

Ishan voted
Srividya voted

#3957 - Zhou 2024
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs

Zhou, Xin; Nie, Ping; Guo, Yiwen; Wei, Haojie; Zhang, Zhanqiu; Minervini, Pasquale; Ma, Ruotian; Gui, Tao; Zhang, Qi; Huang, Xuanjing

arXiv 2024;():

2024

Ref ID: 8736

Retrieval-Augmented Generation (RAG) significantly improved the ability of Large Language Models (LLMs) to solve knowledge-intensive tasks. While existing research seeks to enhance RAG performance by retrieving higher-quality documents or designing RAG-specific LLMs, the internal mechanisms within LLMs that contribute to the effectiveness of RAG systems remain underexplored. In this paper, we aim to investigate these internal mechanisms within the popular Mixture-of-Expert (MoE)-based LLMs and demonstrate how to improve RAG by examining expert activations in these LLMs. Our controlled experiments reveal that several core groups of experts are primarily responsible for RAG-related behaviors. The activation of these core experts can signify the model's inclination towards external/internal knowledge and adjust its behavior. For instance, we identify core experts that can (1) indicate the sufficiency of the model's internal knowledge, (2) assess the quality of retrieved documents, and (3) enhance the model's ability to utilize context. Based on these findings, we propose several strategies to enhance RAG's efficiency and effectiveness through expert activation. Experimental results across various datasets and MoE-based LLMs show the effectiveness of our method.

yuexi voted
Srividya voted

#1985 - Zhou 2023
Traditional Chinese Medicine Epidemic Prevention and Treatment Question-Answering Model Based on LLMs

Zhou, Z.; Yang, T.; Hu, K.

Proceedings - 2023 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():4755-4760

Institute of Electrical and Electronics Engineers Inc. 2023

DOI: 10.1109/BIBM58861.2023.10385748 · Ref ID: 4919

Background: Epidemic diseases in Traditional Chinese Medicine (TCM) constitute an essential part of Chinese medical science. TCM has accumulated rich theoretical and practical experience in the prevention and treatment of epidemic diseases, forming the academic system of epidemic febrile disease and providing robust support for epidemic prevention and resistance in TCM. However, the numerous and complex literature on TCM epidemic diseases brings challenges to the organization and discovery of epidemic disease knowledge. Objective: To leverage the powerful knowledge learning ability of state-of-the-art large language models (LLMs) to address the efficient acquisition and utilization of TCM epidemic disease knowledge. Methods: By collecting content related to epidemic diseases from 194 ancient TCM books, as well as the knowledge graph of TCM epidemic disease prevention and treatment, we built the large TCM epidemic disease model EpidemicCHAT based on the ChatGLM model. To assess the performance of the model, several open-source LLMs were compared in the study. Results: Compared to traditional LLMs, which may fail to answer or produce hallucinations in the field of TCM epidemic diseases, EpidemicCHAT demonstrates superior answering and reasoning abilities. In the evaluation of TCM epidemic disease prescription generation, the model achieved scores of 44.02, 61.10, and 59.40 on the BLEU-4, ROUGE-L, and METEOR metrics, respectively. Conclusion: The EpidemicCHAT model proposed in this study performs excellently in the field of TCM epidemic diseases, which might provide a reference for the construction of TCM LLMs and applications such as TCM auxiliary diagnosis and Chinese herbal prescription generation. © 2023 IEEE.
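The ROUGE-L score reported in this abstract (and used in several other records here) is based on the longest common subsequence between a generated text and a reference. A minimal sketch, assuming whitespace tokenization and the standard F1 formulation from LCS precision and recall:

```python
def lcs_len(a, b):
    # classic dynamic-programming longest-common-subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference):
    # precision = LCS / |candidate|, recall = LCS / |reference|
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    precision, recall = lcs / len(c), lcs / len(r)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Published scores typically use a reference implementation with its own tokenization and stemming, so this toy version will not reproduce reported numbers exactly.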

Kwesi voted
brandon voted

#1733 - Zhu 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics

Zhu, D.; Chen, D.; Li, Q.; Chen, Z.; Ma, L.; Grossklags, J.; Fritz, M.

Findings of the Association for Computational Linguistics: NAACL 2024 - Findings 2024;():4737-4751

Association for Computational Linguistics (ACL) 2024

Ref ID: 4580

Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of “hallucination”, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph, a polygraph for LLMs, as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly differs from the large body of existing research that concentrates on addressing such challenges through black-box evaluations. In particular, we demonstrate that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics during generation via tractable probabilistic models. Experimental results on various open-source LLMs confirm the efficacy of PoLLMgraph, outperforming state-of-the-art methods by a considerable margin, evidenced by over 20% improvement in AUC-ROC on common benchmarking datasets like TruthfulQA. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors. © 2024 Association for Computational Linguistics.

yuexi voted
Srividya voted

#321 - Zhu 2024
Graph Structure Enhanced Pre-Training Language Model for Knowledge Graph Completion

Zhu, H. S.; Xu, D. X.; Huang, Y.; Jin, Z.; Ding, W. P.; Tong, J. H.; Chong, G. S.

IEEE Trans. Emerg. Top. Comput. Intell. 2024;8(4):2697-2708

2024

DOI: 10.1109/tetci.2024.3372442 · Ref ID: 2918

A vast amount of textual and structural information is required for knowledge graph construction and its downstream tasks. However, most current knowledge graphs are incomplete due to the difficulty of knowledge acquisition and integration. Knowledge Graph Completion (KGC) is used to predict missing connections. In previous studies, textual information and graph structural information were utilized independently, without an effective method for fusing these two types of information. In this paper, we propose a graph structure enhanced pre-training language model for knowledge graph completion. Firstly, we design a graph sampling algorithm and a Graph2Seq module for constructing sub-graphs and their corresponding contexts to support large-scale knowledge graph learning and parallel training. This is also the basis for fusing textual data and graph structure. Next, two pre-training tasks based on masked modeling are designed for capturing accurate entity-level and relation-level information. Furthermore, this paper proposes a novel asymmetric Encoder-Decoder architecture to restore masked components, where the encoder is a Pre-trained Language Model (PLM) and the decoder is a multi-relational Graph Neural Network (GNN). The purpose of the architecture is to integrate textual information effectively with graph structural information. Finally, the model is fine-tuned for KGC tasks on two widely used public datasets. The experiments show that the model achieves excellent performance and outperforms baselines in most metrics, demonstrating the effectiveness of fusing structural and semantic information into the knowledge graph.

Srividya voted
Ishan voted

#2188 - Zhu 2023
Automating Method Naming with Context-Aware Prompt-Tuning

Zhu, J.; Li, L.; Yang, L.; Ma, X.; Zuo, C.

2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC) 2023;():203-214

2023

DOI: 10.1109/ICPC58990.2023.00035 · Ref ID: 6816

Method names are crucial to program comprehension and maintenance. Recently, many approaches have been proposed to automatically recommend method names and detect inconsistent names. Though promising, their results are still suboptimal considering the three following drawbacks: 1) These models are mostly trained from scratch, learning two different objectives simultaneously. The misalignment between the two objectives negatively affects training efficiency and model performance. 2) The enclosing class context is not fully exploited, making it difficult to learn the abstract functionality of the method. 3) Current method name consistency checking methods follow a generate-then-compare process, which limits accuracy, as they rely heavily on the quality of generated names and have difficulty measuring semantic consistency. In this paper, we propose an approach named AUMENA to AUtomate MEthod NAming tasks with context-aware prompt-tuning. Unlike existing deep learning based approaches, our model first learns the contextualized representation (i.e., class attributes) of programming language and natural language through the pre-training model, then fully exploits the capacity and knowledge of the large language model with prompt-tuning to precisely detect inconsistent method names and recommend more accurate names. To better identify semantically consistent names, we model the method name consistency checking task as a two-class classification problem, avoiding the limitation of previous generate-then-compare consistency checking approaches. Experimental results show that AUMENA scores 68.6%, 72.0%, 73.6%, and 84.7% on four datasets for method name recommendation, surpassing the state-of-the-art baseline by 8.5%, 18.4%, 11.0%, and 12.0%, respectively. Our approach also scores 80.8% accuracy on method name consistency checking, a 5.5% improvement over the state of the art. All data and trained models are publicly available.

Kwesi voted
Xinchen voted

#46 - Zhu 2023
Automated extraction of domain knowledge in the dairy industry

Zhu, J. S.; Lacroix, R.; Wade, K. M.

Comput. Electron. Agric. 2023;214():10

2023

DOI: 10.1016/j.compag.2023.108330 · Ref ID: 3291

From three weeks before calving to three weeks after, the transition period poses challenges for dairy cattle and farmers. Vast changes in housing, feeding, and reproduction can result in milk yield drops and metabolic and reproductive diseases. Moreover, most of the metabolic processes are intricately linked, as many conditions can coexist. This means that dairy producers and their advisors have difficulty drawing concise conclusions from the many interrelated aspects of transition cow management. Herein, machine-learning techniques and knowledge-graph theory were explored with a view to creating a decision-support system that could provide producers and their advisors with knowledge from the domain literature. Specifically, knowledge is modelled as entities and relationships in knowledge-graph theory, and natural language models were developed to extract information as knowledge graphs. A dataset comprising 1152 sentences from 20 papers was created and split into 922 sentences for training and 230 sentences for testing. Two deep-learning models were then trained to extract entities and relationships, respectively. A bi-directional long short-term memory model was applied to the entity extraction task and obtained an F1 score of 80%. For relationship extraction, a Transformer-based model was deployed but yielded a low F1 of 23%, so another pre-trained Transformer model with 89% accuracy was deployed into the system. After feeding the domain literature into the deep-learning models, a knowledge graph of 1,576 nodes and 3,456 edges was constructed and stored in the graph database Neo4j. Afterward, a semantic parsing method was used to allow users to conduct question answering over the knowledge graph in natural language. In addition, to assess the quality of answers produced from the knowledge built from the papers, answers were sampled and evaluated based on human judgment.
On average, answers scored 7.5 out of 10 and proved informative with respect to the original literature. Although the final interactive results demonstrated a high degree of visualization and scalability, this study primarily sought to demonstrate feasibility. For tailored commercial applications, further improvements could be implemented in knowledge graph expansion and reasoning.
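Storing extracted (entity, relation, entity) triples in Neo4j, as this abstract describes, amounts to emitting Cypher `MERGE` statements. The helper below is a hypothetical illustration (the `Entity` label and naming convention are assumptions, and a production pipeline would use parameterized queries rather than string interpolation):

```python
def triple_to_cypher(head, relation, tail):
    # MERGE keeps entity nodes unique on repeated inserts; Neo4j
    # relationship types are conventionally UPPER_CASE with underscores
    rel_type = relation.upper().replace(" ", "_")
    return (
        f"MERGE (h:Entity {{name: '{head}'}}) "
        f"MERGE (t:Entity {{name: '{tail}'}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
```

For instance, `triple_to_cypher("negative energy balance", "causes", "ketosis")` yields a statement that creates both entity nodes and a `CAUSES` edge between them.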

Davis voted
Srividya voted

#484 - Zhu 2023
KPT: Keyword-Guided Pre-training for Grounded Dialog Generation

Zhu, Q.; Mi, F.; Zhang, Z.; Wang, Y. S.; Li, Y. T.; Jiang, X.; Liu, Q.; Zhu, X. Y.; Huang, M. L.

37th AAAI Conference on Artificial Intelligence (AAAI) / 35th Conference on Innovative Applications of Artificial Intelligence / 13th Symposium on Educational Advances in Artificial Intelligence 2023;():14065-14073

Washington, DC Assoc Advancement Artificial Intelligence 2023

Ref ID: 3553

Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation without relying on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialogue data. We considered three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogues. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
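The keyword-selection step above (treating the tokens a language model is most uncertain about as keywords) can be sketched as follows. The per-token loss values would come from scoring the dialog with a pre-trained LM; here they are supplied directly, and the function name is an assumption for illustration:

```python
def extract_keywords(tokens, token_losses, k=2):
    # the k tokens the LM is most uncertain about (highest loss)
    # are taken as keywords for constructing grounding knowledge
    ranked = sorted(zip(token_losses, tokens), reverse=True)
    return [tok for _, tok in ranked[:k]]
```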

brandon voted
Kwesi voted

#2613 - Zhu 2011
Knowledge Management Method for Expert System Based on Cognitive Model

Zhu, S.; Kong, L.; Liu, J.

2011 International Conference of Information Technology, Computer Engineering and Management Sciences 2011;4():77-80

2011

DOI: 10.1109/ICM.2011.352 · Ref ID: 6079

A living expert system needs a mechanism to update and extend its knowledge in order to adapt to a changing world, and the knowledge acquired through different approaches must be stored so that it can be conveniently reasoned over and updated. The method presented in this paper addresses this need. Simulating the learning procedure of human beings is the core idea of this method, from which we derive the ways to add, delete, amend, and use the knowledge in an expert system. Based on an analysis of the common process by which children come to recognize the world, a cognitive model of concept learning is abstracted. A general concept learning algorithm, a knowledge representation method based on general rules, a forest-shaped logical structure, and a uniform data structure for storage are accordingly presented. Thus, a complete and more scientific management scheme for the knowledge base of an expert system is provided. Finally, through comparison with ontology knowledge bases such as CYC, WordNet, and NKI, two distinguishing characteristics of this management method are discussed.

Xinchen voted
mohammed afaan voted

#865 - Zhu 2017
Using Knowledge Graph And Search Query Click Logs in Statistical Language Model For Speech Recognition

Zhu, W. W.; Int Speech Commun, Assoc

18th Annual Conference of the International-Speech-Communication-Association (INTERSPEECH 2017) 2017;():2735-2738

Stockholm, SWEDEN Isca-Int Speech Communication Assoc 2017

DOI: 10.21437/Interspeech.2017-1790 · Ref ID: 3027

This paper demonstrates how a Knowledge Graph (KG) and Search Query Click Logs (SQCL) can be leveraged in statistical language models to improve named entity recognition for online speech recognition systems. Because some named entities are missing from the training data, they may be recognized as other common words with similar pronunciation. KG and SQCL cover comprehensive and fresh named entities and queries that can be used to mitigate such misrecognition. First, all the entities located in the same area in the KG are clustered together, and the queries that contain the entity names are selected from SQCL as the training data of a geographical statistical language model for each entity cluster. These geographical language models reduce the number of named entities left unseen during model training and can be dynamically switched according to the user location in the recognition phase. Second, if any named entities are identified in the previous utterances within a conversational dialog, the probability of the n-best word sequence paths that contain their related entities is increased for the current utterance by utilizing the entity relationships from the KG and SQCL. In this way, long-term contexts within the dialog are leveraged. Experiments for the proposed approach on voice queries from a spoken dialog system yielded a 12.5% relative perplexity reduction in the language model measurement and a 1.1% absolute word error rate reduction in the speech recognition measurement.
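The second mechanism described in this abstract, boosting n-best hypotheses that contain entities related to entities seen earlier in the dialog, can be sketched roughly as follows. The function and the multiplicative boost are illustrative assumptions, not the paper's actual rescoring formula:

```python
def rescore_nbest(nbest, related_entities, boost=2.0):
    """Increase the score of n-best hypotheses mentioning an entity
    related (via KG/SQCL) to entities from earlier dialog turns."""
    rescored = [
        (score * (boost if any(e in hyp for e in related_entities) else 1.0), hyp)
        for score, hyp in nbest
    ]
    return sorted(rescored, reverse=True)
```

With a suitable boost, a hypothesis containing a dialog-related entity can overtake an acoustically similar but unrelated top hypothesis.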

Mike voted
Srividya voted

#971 - Zhu 2021
An Advanced Smart Contract Conversion and Its Design and Implementation for Auction Contract

Zhu, Y.; Qin, B. H.; Chen, E.; Liu, G. W.

Jisuanji Xuebao 2021;44(3):652-668

2021

DOI: 10.11897/SP.J.1016.2021.00652 · Ref ID: 5592

As second-generation blockchain technology, smart contracts have greatly enriched the functional expression of blockchain, making application development more convenient. Smart contracts are a set of digitally executable protocols spanning business, finance, contract law, and information technology. In recent years, advanced smart contract languages (ASCLs) have been proposed to solve the problem of difficult reading, comprehension, and collaboration when people in different fields write a smart contract. However, this kind of language is still hard to put into practice due to the lack of an effective conversion method from ASCLs to executable smart contract programs. To address this problem, we propose a three-layer smart contract framework comprising an advanced smart-contract layer, a basic smart-contract layer, and an executable machine-code layer. After comparing and analyzing the pros and cons of several ASCLs, we take SPESC as an example to explore how to design conversion rules from its contracts to target-language contracts in Solidity. We specify the conversion rules from two aspects. One is the program architecture of the target language, which consists of a main contract and party contracts. The corresponding rules provide an approach to convert the definition of SPESC-based contracting parties into party sub-contracts in the target language, as well as to produce the rest of the SPESC contract as the main sub-contract in the target language. The other is the approach to specifying not only the program architecture and storage structure on the basic smart-contract layer, but also important mechanisms including personnel management, timing control, and anomaly detection. These mechanisms can assist programmers in semi-automatically writing smart contract programs. Moreover, by introducing the notion of a group, the SPESC-based smart contract can support dynamically adding participants to the contract.
We also verify the readability of SPESC and the correctness of the conversion processes through two case studies. First, we invited students from the departments of computer science and law. Divided into four groups, they were asked to read voting and auction contracts in SPESC and Solidity and to answer questions designed for the contracts. The results show that SPESC is read about twice as fast as Solidity, and with higher accuracy. Then, taking the auction contract as an instance, we analyze the process of bidding contracts and compile them into contracts in SPESC, provide the whole process of converting from a SPESC-based contract to an executable contract program in Solidity according to the above conversion rules, and verify the correctness of the conversion process, including coding, deploying, running, and testing, through an Ethereum private chain. The results show that the conversion rules and the three-layer framework can simplify the writing of smart contracts, standardize the program structure, and help programmers verify the correctness of the contract program. In future work, a formal representation shall be established on the existing SPESC language model. Through formal methods, we can further provide formal analysis tools to verify pre- and post-conditions of contract terms, as well as the time sequence between terms. Secondly, in view of the correctness of the generated Solidity target code, we can continue to improve the generated code based on existing research on vulnerability analysis and detection, optimize the program structure and specifications, and enhance the security of the contract. © 2021, Science Press. All rights reserved.

mohammed afaan voted
yuexi voted

#3366 - Zhu 2024
EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling

Zhu, Yinghao; Ren, Changyu; Wang, Zixiang; Zheng, Xiaochen; Xie, Shiyun; Feng, Junlan; Zhu, Xi; Li, Zhoujun; Ma, Liantao; Pan, Chengwei

arXiv 2024;():

2024

Ref ID: 8339

The integration of multimodal Electronic Health Records (EHR) data has notably advanced clinical predictive capabilities. However, current models that utilize clinical notes and multivariate time-series EHR data often lack the necessary medical context for precise clinical tasks. Previous methods using knowledge graphs (KGs) primarily focus on structured knowledge extraction. To address this, we propose EMERGE, a Retrieval-Augmented Generation (RAG) driven framework aimed at enhancing multimodal EHR predictive modeling. Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models (LLMs) and aligns them with professional PrimeKG to ensure consistency. Beyond triplet relationships, we include entities' definitions and descriptions to provide richer semantics. The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses. These summaries are fused with other modalities utilizing an adaptive multimodal fusion network with cross-attention. Extensive experiments on the MIMIC-III and MIMIC-IV datasets for in-hospital mortality and 30-day readmission tasks demonstrate the superior performance of the EMERGE framework compared to baseline models. Comprehensive ablation studies and analyses underscore the efficacy of each designed module and the framework's robustness to data sparsity. EMERGE significantly enhances the use of multimodal EHR data in healthcare, bridging the gap with nuanced medical contexts crucial for informed clinical predictions.

mohammed afaan voted
yuexi voted

#3126 - Zhu 2024
EMERGE: Enhancing Multimodal Electronic Health Records Predictive Modeling with Retrieval-Augmented Generation

Zhu, Yinghao; Ren, Changyu; Wang, Zixiang; Zheng, Xiaochen; Xie, Shiyun; Feng, Junlan; Zhu, Xi; Li, Zhoujun; Ma, Liantao; Pan, Chengwei

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3549–3559

Boise, ID, USA Association for Computing Machinery 2024

DOI: 10.1145/3627673.3679582 · Ref ID: 7301

mohammed afaan voted
yuexi voted

#3306 - Zhu 2024
Croppable Knowledge Graph Embedding

Zhu, Yushan; Zhang, Wen; Liu, Zhiqiang; Chen, Mingyang; Liang, Lei; Chen, Huajun

arXiv 2024;():

2024

Ref ID: 8443

Knowledge Graph Embedding (KGE) is a common method for Knowledge Graphs (KGs) to serve various artificial intelligence tasks. The suitable dimensions of the embeddings depend on the storage and computing conditions of the specific application scenarios. Once a new dimension is required, a new KGE model needs to be trained from scratch, which greatly increases the training cost and limits the efficiency and flexibility of KGE in serving various scenarios. In this work, we propose a novel KGE training framework, MED, through which we can train once to get a croppable KGE model applicable to multiple scenarios with different dimensional requirements; sub-models of the required dimensions can be cropped out of it and used directly without any additional training. In MED, we propose a mutual learning mechanism to improve the low-dimensional sub-models' performance and make the high-dimensional sub-models retain the capacity that the low-dimensional sub-models have, an evolutionary improvement mechanism to promote the high-dimensional sub-models to master the knowledge that the low-dimensional sub-models cannot learn, and a dynamic loss weight to balance the multiple losses adaptively. Experiments on 3 KGE models over 4 standard KG completion datasets, 3 real application scenarios over a real-world large-scale KG, and experiments extending MED to the language model BERT show the effectiveness, high efficiency, and flexible extensibility of MED.
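The cropping idea above can be pictured as plain prefix slicing of the embedding vectors, a toy sketch under the assumption (which MED's training is designed to guarantee) that the first d coordinates form a usable stand-alone sub-model. The TransE-style scorer is included only to show how a cropped sub-model would be consumed:

```python
def crop_embeddings(embeddings, d):
    # keep only the first d coordinates of every entity/relation vector
    return {name: vector[:d] for name, vector in embeddings.items()}

def transe_score(h, r, t):
    # TransE-style plausibility: negative L1 distance ||h + r - t||_1
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))
```

A naive prefix crop of an ordinarily trained KGE model would degrade badly; the paper's contribution is the training scheme that makes such sub-models perform well.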

mohammed afaan voted
yuexi voted

#1593 - Zia 2024
Leveraging large language models for automated knowledge graphs generation in non-destructive testing

Zia, G. A. J.; Valdestilhas, A.; Torres, B. M.; Kruschwitz, S.

CEUR Workshop Proceedings 2024;3760():101-110

CEUR-WS 2024

Ref ID: 4218

This paper presents an innovative approach for the automatic generation of Knowledge Graphs (KGs) from heterogeneous scientific articles in the domain of Non-Destructive Testing (NDT) applied to building materials. Our methodology leverages large language models (LLMs) to extract and semantically relate concepts from diverse sources. We developed material-specific agents for concrete, wood, steel, and bricks, each equipped with a curated glossary of terms to ensure domain accuracy. These agents process PDF documents, extracting relevant information on deterioration mechanisms, physical changes, and applicable NDT methods. The extracted data is then normalized, validated, and structured into a Neo4j graph database, forming a comprehensive KG. Our results demonstrate the system's ability to automatically discover and represent intricate relationships between materials, deterioration mechanisms, physical changes, and NDT techniques. The generated KG successfully captures complex interactions, such as the applicability of specific NDT methods to various materials under different deterioration conditions. This work not only highlights the potential of KGs in enhancing knowledge discovery and representation in NDT research but also provides a scalable framework for extending this approach to other scientific domains. © 2024 CEUR-WS. All rights reserved.

Srividya voted
Ishan voted

#2713 - Zimina 2018
MuG-QA: Multilingual Grammatical Question Answering for RDF Data

Zimina, E.; Nummenmaa, J.; Jarvelin, K.; Peltonen, J.; Stefanidis, K.

2018 IEEE International Conference on Progress in Informatics and Computing (PIC) 2018;():57-61

2018

DOI: 10.1109/PIC.2018.8706310 · Ref ID: 6362

We introduce Multilingual Grammatical Question Answering (MuG-QA), a system for answering questions in the English, German, Italian and French languages over DBpedia. The natural language modelling and parsing are implemented using Grammatical Framework (GF), a grammar formalism with natural support for multilinguality. The question analysis is based on forming an abstract conceptual grammar from the questions, and then using linearisation of the abstract grammar into different languages to parse the questions. Once a natural language question is parsed, the resulting abstract grammar tree is matched with the knowledge base schema and contents to formulate a SPARQL query. A particular strength of our approach is that once the abstract grammar has been designed, implementation for a new concrete language is relatively quick, supposing that the language has basic support in the GF Resource Grammar Library. MuG-QA has been tested with data from the QALD-7 benchmark and showed competitive results.

mohammed afaan voted
yuexi voted

#2779 - Zong 2002
Partitioning the UMLS semantic network

Zong, Chen; Perl, Y.; Halper, M.; Geller, J.; Huanying, Gu

IEEE Transactions on Information Technology in Biomedicine 2002;6(2):102-108

2002

DOI: 10.1109/TITB.2002.1006296 · Ref ID: 6479

The Unified Medical Language System (UMLS) integrates many well-established biomedical terminologies. The UMLS Semantic Network (SN) can help orient users to the vast knowledge content of the UMLS Metathesaurus (META) via its abstract conceptual view. However, the SN itself is large and complex and may still be difficult to comprehend. Our technique partitions the SN into smaller meaningful units amenable to display on limited-sized computer screens. The basis for the partitioning is the distribution of the relationships within the SN. Three rules are applied to transform the original partition into a second, more cohesive partition.
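The core partitioning idea in this abstract, grouping semantic types by the distribution of their relationships, can be sketched as follows. The semantic types and relationship labels are invented examples, not UMLS data, and the paper's three refinement rules are omitted.

```python
from collections import defaultdict

def partition_by_relationships(relations):
    """Group semantic types that share the same set of relationship labels.

    relations: dict mapping semantic type -> set of relationship labels.
    Returns a list of groups (each group is a list of types).
    """
    groups = defaultdict(list)
    for stype, rels in relations.items():
        groups[frozenset(rels)].append(stype)   # identical signature -> same unit
    return list(groups.values())

network = {
    "Disease": {"affects", "co-occurs_with"},
    "Syndrome": {"affects", "co-occurs_with"},
    "Drug": {"treats"},
}
print(partition_by_relationships(network))
```

Each resulting unit is small enough to display on its own, which is the stated motivation for partitioning the network.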

mohammed afaan voted
yuexi voted

#402 - Zou 2023
K-DLM: A Domain-Adaptive Language Model Pre-Training Framework with Knowledge Graph

Zou, J. X.; Xie, Z. T.; Chen, J. H.; Hou, J. W.; Yan, Q.; Zheng, H. T.

32nd International Conference on Artificial Neural Networks (ICANN) 2023;14257():447-459

Heraklion, GREECE Springer International Publishing Ag 2023

DOI: 10.1007/978-3-031-44216-2_37 · Ref ID: 2923

Despite the excellent performance of pre-trained language models, such as BERT, on various natural language processing tasks, they struggle with tasks that require domain-specific knowledge. Integrating information from knowledge graphs through pre-training tasks is a common approach. However, existing models tend to focus on entity information at the word level and fail to capture the rich information in knowledge graphs. To address this issue, we propose a domain-adaptive language model pre-training framework with a knowledge graph (K-DLM). K-DLM can learn both word and lexical-semantic level entity information and relationships from the knowledge graph. It predicts entity categories and sememes for masked phrases, replaces entities in sentences according to the knowledge graph, and learns relationship information via contrastive learning. The evaluation on open-domain and domain-specific tasks demonstrates that K-DLM outperforms previous models, particularly in domain-specific contexts. Our findings highlight K-DLM as an excellent pre-training framework for knowledge-driven problems that leverage domain knowledge graphs.

Xinchen voted
Kwesi voted

#3768 - Zou 2024
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

Zou, Wei; Geng, Runpeng; Wang, Binghui; Jia, Jinyuan

arXiv 2024;():

2024

Ref ID: 8095

Large language models (LLMs) have achieved remarkable success due to their exceptional generative capabilities. Despite their success, they also have inherent limitations such as a lack of up-to-date knowledge and hallucination. Retrieval-Augmented Generation (RAG) is a state-of-the-art technique to mitigate these limitations. The key idea of RAG is to ground the answer generation of an LLM on external knowledge retrieved from a knowledge database. Existing studies mainly focus on improving the accuracy or efficiency of RAG, leaving its security largely unexplored. We aim to bridge the gap in this work. We find that the knowledge database in a RAG system introduces a new and practical attack surface. Based on this attack surface, we propose PoisonedRAG, the first knowledge corruption attack to RAG, where an attacker could inject a few malicious texts into the knowledge database of a RAG system to induce an LLM to generate an attacker-chosen target answer for an attacker-chosen target question. We formulate knowledge corruption attacks as an optimization problem, whose solution is a set of malicious texts. Depending on the background knowledge (e.g., black-box and white-box settings) of an attacker on a RAG system, we propose two solutions to solve the optimization problem, respectively. Our results show PoisonedRAG could achieve a 90% attack success rate when injecting five malicious texts for each target question into a knowledge database with millions of texts. We also evaluate several defenses and our results show they are insufficient to defend against PoisonedRAG, highlighting the need for new defenses.
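The attack surface this abstract identifies, that retrieval over a knowledge database can be steered by injected texts, can be demonstrated with a toy retriever. The bag-of-words overlap metric and the texts below are illustrative only; PoisonedRAG itself crafts malicious texts by solving an optimization problem against real dense retrievers.

```python
# Toy retriever: rank corpus texts by word overlap with the question.
# A text stuffed with the target question's words crowds out the
# genuine passage, illustrating the knowledge-corruption attack surface.
def retrieve(question, corpus, k=2):
    q = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda t: len(q & set(t.lower().split())),
        reverse=True,
    )
    return scored[:k]

corpus = [
    "The Eiffel Tower is located in Paris.",
    # attacker-injected text, padded with the target question's words:
    "Where is the Eiffel Tower located? The Eiffel Tower is located in Rome.",
]
top = retrieve("Where is the Eiffel Tower located?", corpus, k=1)
print(top[0])  # the injected text wins the overlap ranking
```

An LLM grounded on this retrieved context would then be steered toward the attacker-chosen answer, which is why the paper argues retrieval-side defenses are needed.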

mohammed afaan voted
yuexi voted

#1719 - Zuo 2022
Patent-KG: Patent Knowledge Graph Extraction for Engineering Design

Zuo, H.; Yin, Y.; Childs, P.

Proceedings of the Design Society 2022;2():821-830

Cambridge University Press 2022

DOI: 10.1017/pds.2022.84 · Ref ID: 5380

This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The arising patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts in a patent, by searching the attention graph in language models. The extracted entities are compared with other benchmarks in the criteria of recall rate. The result reaches the highest 0.8 recall rate in the standard list of mechanical engineering related technical terms, which means the highest coverage of engineering words. © The Author(s), 2022.

Mike voted
Xinchen voted

#3754 - Zuo 2021
Patent-KG: Patent Knowledge Graph Use for Engineering Design

Zuo, Haoyu; Yin, Yuan; Childs, Peter

arXiv 2021;():

2021

Ref ID: 7476

To facilitate knowledge reuse in engineering design, several dataset approaches have been proposed and applied by designers. This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The arising patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts in a patent, by searching the attention graph in language models. This method avoids using expensive labelled data in supervised learning or listing complex syntactic rules in rule-based extraction. The extracted entities are compared with other benchmarks in the criteria of recall rate. The result reaches the highest 0.9 recall rate in the standard list of mechanical engineering related technical terms, which means the highest coverage of engineering words. The extracted relationships are also compared with other benchmarks. The result shows that our method provides more contextual information in relationships, and extracts more relationship types including positional and negation relationships.
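The unsupervised mechanism sketched in this abstract, searching the attention graph of a language model for knowledge facts, can be illustrated in miniature: relate each token to the token it attends to most strongly. The attention matrix below is fabricated for illustration; the real patent-KG pipeline reads attentions from a transformer and applies further filtering.

```python
# Illustrative attention-graph extraction: for each token, pick the
# other token receiving its highest attention weight as a related entity.
def extract_links(tokens, attention):
    """attention[i][j] is the weight token i places on token j."""
    links = []
    for i, row in enumerate(attention):
        # ignore self-attention when choosing the strongest link
        j = max(range(len(row)), key=lambda k: row[k] if k != i else -1.0)
        links.append((tokens[i], tokens[j]))
    return links

tokens = ["gear", "transmits", "torque"]
attention = [            # made-up weights, rows sum to 1
    [0.1, 0.7, 0.2],
    [0.4, 0.2, 0.4],
    [0.2, 0.7, 0.1],
]
print(extract_links(tokens, attention))
```

The appeal of this style of extraction, as the abstract notes, is that it needs neither labelled training data nor hand-written syntactic rules.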

Mike voted
Xinchen voted

#1946
Text2Story 2024 - Proceedings of Text2Story: 7th Workshop on Narrative Extraction From Texts, held in conjunction with the 46th European Conference on Information Retrieval, ECIR 2024

CEUR Workshop Proceedings 2024;3671():

CEUR-WS 2024

Ref ID: 4656

The proceedings contain 13 papers. The topics discussed include: dataset annotation and model building for identifying biases in news narratives; evaluating the ability of computationally extracted narrative maps to encode media framing; from nodes to narratives: a knowledge graph-based storytelling approach; estimating narrative durations: proof of concept; ROGER: extracting narratives using large language models from Robert Gerstmann's historical photo archive of the Sacambaya Expedition in 1928; representing complex relative chronology across narrative levels in movie plots; untangling a web of temporal relations in news articles; the geography of ‘fear’, ‘sadness’, ‘anger’ and ‘joy’: exploring the emotional landscapes in the holocaust survivors’ testimonies; and unexpected gender stereotypes in AI-generated stories: hairdressers are female, but so are doctors.

Mike voted
Davis voted

#1933
TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop

TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop 2024;():

Association for Computational Linguistics (ACL) 2024

Ref ID: 4750

The proceedings contain 7 papers. The topics discussed include: how do conversational agents in healthcare impact on patient agency?; why academia should cut back general enthusiasm about CAs; bridging the language gap: integrating language variations into conversational AI agents for enhanced user engagement; socio-cultural adapted chatbots: harnessing knowledge graphs and large language models for enhanced context awareness; how should conversational agent systems respond to sexual harassment?; non-referential functions of language in social agents: the case of social proximity; and making a long story short in conversation modeling.

mohammed afaan voted
Ishan voted

#1883
SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024;():

Association for Computing Machinery, Inc 2024

Ref ID: 3951

The proceedings contain 380 papers. The topics discussed include: TRAD: enhancing LLM agents with step-wise thought retrieval and aligned decision; CorpusLM: towards a unified language model on corpus for knowledge-intensive tasks; a setwise approach for effective and highly efficient zero-shot ranking with large language models; unsupervised large language model alignment for information retrieval via contrastive feedback; METAHKG: meta hyperbolic learning for few-shot temporal reasoning; transformer-based reasoning for learning evolutionary chain of events on temporal knowledge graph; contrast then memorize: semantic neighbor retrieval-enhanced inductive multimodal knowledge graph completion; and Amazon-KG: a knowledge graph enhanced cross-domain recommendation dataset.

Mike voted
Srividya voted

#1877
SemTech4STLD 2024 - 2nd International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data, co-located with the Extended Semantic Web Conference 2024, ESWC 2024

CEUR Workshop Proceedings 2024;3697():

CEUR-WS 2024

Ref ID: 4563

The proceedings contain 8 papers. The topics discussed include: GerPS-NER: a dataset for named entity recognition to support public service process creation in Germany; ChatGPT vs. Google Gemini: assessing AI frontiers for patent prior art search using European search reports; PRICER: leveraging few-shot learning with fine-tuned large language models for unstructured economic data; extracting license information from web resources with a large language model; investigating environmental, social, and governance (ESG) discussions in news: a knowledge graph analysis empowered by AI; bridging the innovation gap: leveraging patent information for scientists by constructing a patent-centric knowledge graph; automating citation placement with natural language processing and transformers; and combining knowledge graphs and large language models to ease knowledge access in software architecture research.

Mike voted
mohammed afaan voted

#1875
SemTab 2022 - Proceedings of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, co-located with the 21st International Semantic Web Conference, ISWC 2022

CEUR Workshop Proceedings 2022;3320():

CEUR-WS 2022

Ref ID: 5487

The proceedings contain 13 papers. The topics discussed include: results of SemTab 2022; SOTAB: the WDC Schema.org table annotation benchmark; Wikary: a dataset of N-ary Wikipedia tables matched to qualified Wikidata statements; MammoTab: a giant and comprehensive dataset for semantic table interpretation; a large scale corpus of food composition tables; KGCODE-Tab results for SemTab 2022; from heuristics to language models: a journey through the universe of semantic table interpretation with DAGOBAH; s-elBat: a semantic interpretation approach for Messy taBle-s; JenTab: do CTA solutions affect the entire scores?; yet another milestone for Kepler-aSI at SemTab 2022; a low-resource approach to SemTab 2022; and towards an approach based on knowledge graph refinement for tabular data to knowledge graph matching.

Mike voted
mohammed afaan voted

#1874
SEMPDW 2022 - Proceedings of Poster and Demo Track and Workshop Track of the 18th International Conference on Semantic Systems, co-located with 18th International Conference on Semantic Systems, SEMANTiCS 2022

CEUR Workshop Proceedings 2022;3235():

CEUR-WS 2022

Ref ID: 5429

The proceedings contain 28 papers. The topics discussed include: attribute-based access control on solid pods using privacy-friendly credentials; language-agnostic knowledge graphs for smarter multilingual chatbots; solid proof of concept in an enterprise loan request use case; applying a mapping quality framework in cloud native monitoring; misinformation detection: using linguistic cues; a semantic policy language for usage control; proposal for PORQUE, a polylingual hybrid question answering system; Wikibase as an infrastructure for community documents: the example of the disability wiki platform; combining knowledge graphs and language models to answer questions over tables; semantifying the governance of data in Europe; towards a knowledge access & representation layer; and towards knowledge graph based services in accounting use cases.

mohammed afaan voted
yuexi voted

#1873
SEMPDS 2023 - Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems, co-located with 19th International Conference on Semantic Systems, SEMANTiCS 2023

CEUR Workshop Proceedings 2023;3526():

CEUR-WS 2023

Ref ID: 5194

The proceedings contain 9 papers. The topics discussed include: a framework to generate, store, and publish FAIR data in experimental sciences; a mapping lifecycle for public procurement data; a toolset for normative interpretations in FLINT; developing a scalable benchmark for assessing large language models in knowledge graph engineering; enhancing interpretability of machine learning models over knowledge graphs; OntoAnon: an anonymizer for sharing ontology structure without data; SPARQLGEN: one-shot prompt-based approach for SPARQL query generation; and towards assessing FAIR research software best practices in an organization using RDF-star.

mohammed afaan voted
Ishan voted

#1852
Scholarly QALD 2023 and SemREC 2023 - Joint Proceedings of 1st Scholarly QALD Challenge 2023 and 4th SeMantic Answer Type, Relation and Entity Prediction Tasks Challenge 2023, co-located with 22nd International Semantic Web Conference, ISWC 2023

CEUR Workshop Proceedings 2023;3592():

CEUR-WS 2023

Ref ID: 5070

The proceedings contain 9 papers. The topics discussed include: when context matters: entity linking in the scholarly domain; NLQxform: a language model-based question to SPARQL transformer; a structure and content prompt-based method for knowledge graph question answering over scholarly data; leveraging LLMs in scholarly knowledge graph question answering; improving subgraph extraction algorithms for one-shot SPARQL query generation with large language models; PSYCHIC: a neuro-symbolic framework for knowledge graph question-answering grounding; BERTologyNavigator: advanced question answering with BERT-based semantics; enhanced GAT: expanding receptive field with meta path-guided RDF rules for two-hop connectivity; and evaluating different methods for semantic reasoning over ontologies.

Mike voted
Davis voted

#1757
Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023

ACM International Conference Proceeding Series 2023;():

Association for Computing Machinery 2023

Ref ID: 4760

The proceedings contain 255 papers. The topics discussed include: a comparison of mesh-free differentiable programming and data-driven strategies for optimal control under PDE constraints; autotuning Apache TVM-based scientific applications using Bayesian optimization; enhancing heterogeneous federated learning with knowledge extraction and multi-model fusion; elastic deep learning through resilient collective operations; accelerating particle and fluid simulations with differentiable and interpretable graph networks for solving forward and inverse problems; machine learning applied to single-molecule activity prediction; entropy-driven optimal sub-sampling of fluid dynamics for developing machine-learned surrogates; towards rapid autonomous electron microscopy with active meta-learning; protein generation via genome-scale language models with bio-physical scoring; Tencoder: tensor-product encoder-decoder architecture for predicting solutions of PDEs with variable boundary data; and AI/ML-derived whole-genome predictor prospectively and clinically predicts survival and response to treatment in brain cancer.

mohammed afaan voted
yuexi voted

#1749
Proceedings - 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024

Proceedings - 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

Ref ID: 4492

The proceedings contain 85 papers. The topics discussed include: differentially private multi-label learning is harder than you’d think; attacking operational technology without specialized knowledge: the unspecialized OT threat actor profile; towards an integrated provenance framework - a scenario for marine data; better left shift security! framework for secure software development; are you sure you want to do coordinated vulnerability disclosure?; actionable cyber threat intelligence using knowledge graphs and large language models; optimal flow collector placement in experimental networks; and a methodology to measure the ‘cost’ of CPS attacks: not all CPS networks are created equal.

mohammed afaan voted
Ishan voted

#1690
NLP4ConvAI 2023 - 5th Workshop on NLP for Conversational AI, Proceedings of the Workshop

Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():

Association for Computational Linguistics (ACL) 2023

Ref ID: 5138

The proceedings contain 13 papers. The topics discussed include: response generation in longitudinal dialogues: which knowledge representation helps?; on the underspecification of situations in open-domain conversational datasets; correcting semantic parses with natural language through dynamic schema encoding; dialogue state tracking with sparse local slot attention; LLM-Eval: unified multi-dimensional automatic evaluation for open-domain conversations with large language models; cTBLS: augmenting large language models with conversational tables; user simulator assisted open-ended conversational recommendation system; evaluating inter-bilingual semantic parsing for Indian languages; zero-shot dialogue relation extraction by relating explainable triggers and relation names; generating video game scripts with style; and a survey of challenges and methods in the computational modeling of multi-party dialog.

mohammed afaan voted
yuexi voted

#1680
NeSy 2023 - Proceedings of the 17th International Workshop on Neural-Symbolic Learning and Reasoning

CEUR Workshop Proceedings 2023;3432():

CEUR-WS 2023

Ref ID: 5254

The proceedings contain 33 papers. The topics discussed include: a roadmap for neuro-argumentative learning; what's wrong with gradient-based complex query answering?; closing the neural-symbolic cycle: knowledge extraction, user intervention and distillation from convolutional neural networks; the challenge of learning symbolic representations; exploring mathematical conjecturing with large language models; learning logic constraints from demonstration; from axioms over graphs to vectors, and back again: evaluating the properties of graph-based ontology embeddings; neural-symbolic predicate invention: learning relational concepts from visual scenes; semantic interpretability of convolutional neural networks by taxonomy extraction; preliminary results on a state-driven method for rule construction in neural-symbolic reinforcement learning; and is the proof length a good indicator of hardness for reason-able embeddings?.

mohammed afaan voted
Ishan voted

#1668
NAIS 2023 - Proceedings of the 5th Symposium of the Norwegian AI Society

CEUR Workshop Proceedings 2023;3431():

CEUR-WS 2023

Ref ID: 5301

The proceedings contain 10 papers. The topics discussed include: crowd simulation with deliberative-reactive agents; generating natural language dialogues using large language models with adapters; the AI Act and the risks posed by generative AI models; Bayesian exploration in deep reinforcement learning; analyzing literary texts in Lithuanian sign language with computer vision: a proof of concept; automatic detection of manipulative consent management platforms and the journey into the patterns of darkness; EvoLP.jl: a playground for evolutionary computation in Julia; making sense of nonsense: integrated gradient-based input reduction to improve recall for check-worthy claim detection; construction of a relevance knowledge graph with application to the LOCAL news angle; and container-based IoT architectures: use case for visual person counting.

mohammed afaan voted
yuexi voted

#1649
MLSMKG 2021 - Machine Learning with Symbolic Methods and Knowledge Graphs, co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2021

CEUR Workshop Proceedings 2021;2997():

CEUR-WS 2021

Ref ID: 5670

The proceedings contain 4 papers. The topics discussed include: ontology-based N-ball concept embeddings informing few-shot image classification; contextual graph representation learning in text disambiguation; contextual language models for knowledge graph completion; and on refining BERT contextualized embeddings using semantic lexicons.

mohammed afaan voted
Ishan voted

#1551
Lang + Mol 2024 - 1st Workshop on Language + Molecules, Proceedings of the Workshop

Lang + Mol 2024 - 1st Workshop on Language + Molecules, Proceedings of the Workshop 2024;():

Association for Computational Linguistics (ACL) 2024

Ref ID: 4212

The proceedings contain 15 papers. The topics discussed include: could chemical language models benefit from message passing; ALMol: aligned language-molecule translation LLMs through offline preference contrastive optimization; evaluating extrapolation ability of large language model in chemical domain; design proteins using large language models: enhancements and comparative analyses; enhanced BioT5+ for molecule-text translation: a three-stage approach with data distillation, diverse training, and voting ensemble; SciMind: a multimodal mixture-of-experts model for advancing pharmaceutical sciences; knowledge graph extraction from total synthesis documents; and KnowLab’s submission to L+M shared task: all you need is continued pretraining of chemistry texts even for molecule captioning.

mohammed afaan voted
Ishan voted

#1388
ICISE 2021 - 2021 6th International Conference on Information Systems Engineering

ACM International Conference Proceeding Series 2021;():

Association for Computing Machinery 2021

Ref ID: 5539

The proceedings contain 18 papers. The topics discussed include: big data: finding frequencies of faulty multimedia data; financial big data security and privacy in x-accounting: a step further to implement the triple-entry accounting; research on real-time data warehouse technology for sea battlefield; business planning and big data, budget modelling upgrade through data science; predicting the total population development of China based on logistic blocking growth model and improved grey GM (1,1) prediction model; learning knowledge uncertainty from the pretrained language model; dual-channel BERT-DBLCA based on attention mechanism for news category label classification model; research on the development of key technologies of tactical edge cloud; and research on method of undesirable text recognition based on deep learning and knowledge graph.

Xinchen voted
mohammed afaan voted

#1233
EKG-LLM 2023 - Proceedings of the Workshop on Enterprise Knowledge Graphs using Large Language Models, co-located with 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023

CEUR Workshop Proceedings 2023;3532():

CEUR-WS 2023

Ref ID: 5156

The proceedings contain 6 papers. The topics discussed include: EduEmbedd - a knowledge graph embedding for education; related table search for numeric data using large language models and enterprise knowledge graphs; cognitive retrieve: empowering document retrieval with semantics and domain specific knowledge graph; CRUSH: cybersecurity research using universal LLMs and semantic hypernetworks; and StATIK+: structure and text for inductive knowledge graph modeling and paths towards enterprise implementations.

mohammed afaan voted
yuexi voted

#1232
EKAW-C 2022 - Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge Management

CEUR Workshop Proceedings 2022;3256():

CEUR-WS 2022

Ref ID: 5500

The proceedings contain 11 papers. The topics discussed include: experiment maker: a tool to create experiments with GPT-3 easily; CrowdIQ: an ontology for crowdsourced information quality assessments; automated identification of flaky builds using knowledge graphs; ATONTE: towards a new methodology for seed ontology development from texts and experts; a step toward semantic content negotiation; FAIR ontologies, FAIR ontology alignments; extracting structured knowledge from Dutch legal texts: a rule-based approach; knowledge-based legal document retrieval: a case study on Italian civil court decisions; ITALIAN-LEGAL-BERT: a pre-trained transformer language model for Italian law; and public procurement fraud detection and artificial intelligence techniques: a literature review.

Xinchen voted
mohammed afaan voted

#1202
DL4KG 2023 - Proceedings of the Workshop on Deep Learning for Knowledge Graphs, co-located with the 21st International Semantic Web Conference, ISWC 2023

CEUR Workshop Proceedings 2023;3559():

CEUR-WS 2023

Ref ID: 5052

The proceedings contain 7 papers. The topics discussed include: location query answering using box embeddings; knowledge graph injection for reinforcement learning; benchmarking the abilities of large language models for RDF knowledge graph creation and comprehension: how well do LLMs speak turtle?; enhancing large language models with knowledge graphs for classification tasks in the tourism domain; universal preprocessing operators for embedding knowledge graphs with literals; NNKGC: improving knowledge graph completion with node neighborhoods; and enhancing scholarly understanding: a comparison of knowledge injection strategies in large language models.

Davis voted
mohammed afaan voted

#1168
Deep Learning Inside Out: 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO 2021 - Proceedings, co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics

Deep Learning Inside Out: 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO 2021 - Proceedings, co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics 2021;():

Association for Computational Linguistics (ACL) 2021

Ref ID: 5666

The proceedings contain 14 papers. The topics discussed include: transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors; reconstructing implicit knowledge with language models; investigating the effect of background knowledge on natural questions; augmenting topic aware knowledge-grounded conversations with dynamic built knowledge graphs; what makes my model perplexed? a linguistic investigation on neural language models perplexity; how do BERT embeddings organize linguistic knowledge?; enhancing multiple-choice question answering with causal knowledge; and low anisotropy sense retrofitting (LASeR): towards isotropic and sense enriched representations.

Ishan voted
Srividya voted

#1157
D2R2 2024 - Proceedings of the 3rd International Workshop on Linked Data-Driven Resilience Research, co-located with European Semantic Web Conference 2024, ESWC 2024

CEUR Workshop Proceedings 2024;3707():

CEUR-WS 2024

Ref ID: 4538

The proceedings contain 8 papers. The topics discussed include: empowering supply chains resilience: LLMs-powered BN for proactive supply chain risk identification; anticipate risk with the value and trade flows knowledge graph; entity alignment for knowledge graphs in the context of supply chain risk management; leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance; towards a regional public dashboard for crisis and resilience management; an automated evaluation framework for graph database query generation leveraging large language models; and towards modeling the structure of product dependencies in supply networks to identify bottlenecks among suppliers.

Srividya voted
Mike voted

#1118
CONDA 2024 - 1st Data Contamination Workshop, Proceedings of the Workshop

CONDA 2024 - 1st Data Contamination Workshop, Proceedings of the Workshop 2024;():

Association for Computational Linguistics (ACL) 2024

Ref ID: 4383

The proceedings contain 4 papers. The topics discussed include: evaluating Chinese large language models on discipline knowledge acquisition via memorization and robustness assessment; confounders in instance variation for the analysis of data contamination; a taxonomy for data contamination in large language models; and data contamination report from the 2024 CONDA shared task.

mohammed afaan voted
yuexi voted

#1082
CMLDS 2024 - 2024 International Conference on Computing, Machine Learning and Data Science, Conference Proceedings

ACM International Conference Proceeding Series 2024;():

Association for Computing Machinery 2024

Ref ID: 4031

The proceedings contain 60 papers. The topics discussed include: privacy-preservation robust federated learning with blockchain-based hierarchical framework; spatio-temporal hypergraph convolutional network based network traffic prediction; hash function based on quantum walks with two-step memory; adversarial analysis and methods for math word problems; object tracking based on adaptive multi-template fusing; analysis of spatial-temporal variability and heterogeneity of soil moisture; can deep learning large language models be used to unravel knowledge graph creation?; binary and multi-label machine learning models for discrete-time survival analysis: a case study to predict complications and mortality in Thai diabetic patients; power factor anomaly detection using data stream summaries; and consensus filter for distributed sensor networks with unknown colored noise.

#1057
Case-Based Reasoning Research and Development - 31st International Conference, ICCBR 2023, Proceedings

Lect. Notes Comput. Sci. 2023;14141 LNAI():

2023

Ref ID: 5296

The proceedings contain 26 papers. The special focus in this conference is on Case-Based Reasoning Research and Development. The topics include: CBR Driven Interactive Explainable AI; selecting Explanation Methods for Intelligent IoT Systems: A Case-Based Reasoning Approach; CBR-fox: A Case-Based Explanation Method for Time Series Forecasting Models; group Fairness in Case-Based Reasoning; Addressing Underestimation Bias in CBR Through Case-Base Maintenance; towards Addressing Problem-Distribution Drift with Case Discovery; case-Based Adaptation of Argument Graphs with WordNet and Large Language Models; Failure-Driven Transformational Case Reuse of Explanation Strategies in CloodCBR; a Case-Based Approach for Workflow Flexibility by Deviation; synergies Between Case-Based Reasoning and Deep Learning for Survival Analysis in Oncology; lazy Adaptation Knowledge Learning Based on Frequent Closed Itemsets; an Overview and Comparison of Case-Based Reasoning Frameworks; case-Based Cleaning of Text Images; a Multi-agent Case-Based Reasoning Intrusion Detection System Prototype; a Case-Based Reasoning Approach to Company Sector Classification Using a Novel Time-Series Case Representation; an Integrated Approach to Predicting the Influence of Reputation Mechanisms on Q&A Communities; retrieval of Similar Cases to Improve the Diagnosis of Diabetic Retinopathy; CBR Assisted Context-Aware Surface Realisation for Data-to-Text Generation; explanation of Similarities in Process-Oriented Case-Based Reasoning by Visualization; on-Demand and Model-Driven Case Building Based on Distributed Data Sources; the Case for Circularities in Case-Based Reasoning; a Contextual Information-Augmented Probabilistic Case-Based Reasoning Model for Knowledge Graph Reasoning; case-Based Sample Generation Using Multi-Armed Bandits; hybrid Event Memory as a Case Base for State Estimation in Cognitive Agents.

#1045
CAiSE-DC 2024 - Proceedings of the Doctoral Consortium Papers Presented at the 36th International Conference on Advanced Information Systems Engineering

CEUR Workshop Proceedings 2024;3767():

CEUR-WS 2024

Ref ID: 4135

The proceedings contain 8 papers. The topics discussed include: from adoption to endurance: exploring the dynamics of general-purpose AI adoption across time and contexts; intelligent perception systems for multi-modal data processing in industrial application contexts; a conceptual modeling-based journey into variant interpretation: from unpacking to operationalization; a methodological approach to model-driven software development for quality assurance in metaverse environments; integrating LLMs with knowledge graphs-enhanced task-oriented dialogue systems; translating polygenic risk score research to a clinical setting; comparable and repeatable information security level evaluation; and selecting adequate machine learning methods for human-computer interaction data sets: guidelines and a conceptual structure.

#905
8th China Conference on Knowledge Graph and Semantic Computing, CCKS 2023

Commun. Comput. Info. Sci. 2023;1923 CCIS():

2023

Ref ID: 5148

The proceedings contain 28 papers. The special focus in this conference is on Knowledge Graph and Semantic Computing. The topics include: A Generalized Strategy of Chinese Grammatical Error Diagnosis Based on Task Decomposition and Transformation; conversational Search Based on Utterance-Mask-Passage Post-training; financial Fraud Detection Based on Deep Learning: Towards Large-Scale Pre-training Transformer Models; GERNS: A Graph Embedding with Repeat-Free Neighborhood Structure for Subgraph Matching Optimization; feature Enhanced Structured Reasoning for Question Answering; conditional Knowledge Graph: Design, Dataset and a Preliminary Model; ODKG: An Official Document Knowledge Graph for the Effective Management; CCD-ASQP: A Chinese Cross-Domain Aspect Sentiment Quadruple Prediction Dataset; move Structure Recognition in Scientific Papers with Saliency Attribution; causE: Towards Causal Knowledge Graph Embedding; Moral Essential Elements: MEE-A Dataset for Moral Judgement; improving Adaptive Knowledge Graph Construction via Large Language Models with Multiple Views; single Source Path-Based Graph Neural Network for Inductive Knowledge Graph Reasoning; a Graph Learning Based Method for Inductive Knowledge Graph Relation Prediction; LLM-Based SPARQL Generation with Selected Schema from Large Scale Knowledge Base; Robust NL-to-Cypher Translation for KBQA: Harnessing Large Language Model with Chain of Prompts; in-Context Learning for Knowledge Base Question Answering for Unmanned Systems Based on Large Language Models; a Military Domain Knowledge-Based Question Answering Method Based on Large Language Model Enhancement; Advanced PromptCBLUE Performance: A Novel Approach Leveraging Large Language Models; exploring the Logical Expressiveness of Graph Neural Networks by Establishing a Connection with C2; research on Joint Representation Learning Methods for Entity Neighborhood Information and Description Information; harvesting Event Schemas from Large Language Models; NTDA: Noise-Tolerant Data Augmentation for Document-Level Event Argument Extraction; Event-Centric Opinion Mining via In-Context Learning with ChatGPT; relation Repository Based Adaptive Clustering for Open Relation Extraction.

#902
6th China Conference on Knowledge Graph and Semantic Computing, CCKS 2021

Commun. Comput. Info. Sci. 2022;1553 CCIS():

2022

Ref ID: 5514

The proceedings contain 17 papers. The special focus in this conference is on Knowledge Graph and Semantic Computing. The topics include: Enhance Both Text and Label: Combination Strategies for Improving the Generalization Ability of Medical Entity Extraction; knowledge-Enhanced Retrieval: A Scheme for Question Answering; multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model; multi-strategies Integrated Information Extraction for Scholar Profiling Task; named Entity Recognition and Event Extraction in Chinese Electronic Medical Records; strategies for Enhancing Generalization Ability of Communication Event Co-reference Resolution; Unmanned Aerial Vehicle Knowledge Graph Construction with SpERT; Method Description for CCKS 2021 Task 3: A Classification Approach of Scholar Structured Information Extraction from HTML Web Pages; a Dual-Classifier Model for General Fine-Grained Event Detection Task; a Joint Training Framework Based on Adversarial Perturbation for Video Semantic Tags Classification; a Multi-modal System for Video Semantic Understanding; an Integrated Method of Semantic Parsing and Information Retrieval for Knowledge Base Question Answering; Basic Profiling Extraction Based on XGBoost; data Augmentation Based on Pre-trained Language Model for Event Detection; Does BERT Know Which Answer Beyond the Question?.

#901
4th International Conference on Cognitive Computing, ICCC 2020, held as part of Services Conference Federation, SCF 2020

Lect. Notes Comput. Sci. 2020;12408 LNCS():

2020

Ref ID: 5714

The proceedings contain 10 papers. The special focus in this conference is on Cognitive Computing. The topics include: PRTransE: Emphasize More Important Facts Based on Pagerank for Knowledge Graph Completion; context Based Quantum Language Model with Application to Question Answering; improving Fake Product Detection with Aspect-Based Sentiment Analysis; a Dual Layer Regression Model for Cross-border E-commerce Industry Sale and Hot Product Prediction; end-to-End Nested Multi-Attention Network for 3D Brain Tumor Segmentation; ALBERT-Based Chinese Named Entity Recognition; cognitive and Predictive Analytics on Big Open Data; semantic Enhancement Based Dynamic Construction of Domain Knowledge Graph.

#959
42nd European Conference on IR Research, ECIR 2020

Lect. Notes Comput. Sci. 2020;12036 LNCS():

2020

Ref ID: 5791

The proceedings contain 144 papers. The special focus in this conference is on IR Research. The topics include: Neural embedding-based metrics for pre-retrieval query performance prediction; a latent model for ad hoc table retrieval; hybrid semantic recommender system for chemical compounds; Assessing the impact of OCR errors in information retrieval; towards query logs for privacy studies: On deriving search queries from questions; machine-actionable data management plans: A knowledge retrieval approach to automate the assessment of funders’ requirements; session-based path prediction by combining local and global content preferences; unsupervised ensemble of ranking models for news comments using pseudo answers; irony detection in a multilingual context; document network projection in pretrained word embedding space; the effect of content-equivalent near-duplicates on the evaluation of search engines; supervised learning methods for diversification of image search results; ANTIQUE: A non-factoid question answering benchmark; neural query-biased abstractive summarization using copying mechanism; distant supervision for extractive question summarization; text-image-video summary generation using joint integer linear programming; domain adaptation via context prediction for engineering diagram search; crowdsourcing truthfulness: The impact of judgment scale and assessor bias; novel and diverse recommendations by leveraging linear models with user and item embeddings; a multi-task approach to open domain suggestion mining using language model for text over-sampling; medLinker: Medical entity linking with neural representations and dictionary matching; From MaxSCORE to block-max WAND: The story of how lucene significantly improved query evaluation performance; ranking significant discrepancies in clinical reports; teaching a new dog old tricks: Resurrecting multilingual retrieval using zero-shot learning.

#942
21st International Semantic Web Conference, ISWC 2022

Lect. Notes Comput. Sci. 2022;13489 LNCS():

2022

Ref ID: 5454

The proceedings contain 48 papers. The special focus in this conference is on International Semantic Web. The topics include: H2TNE: Temporal Heterogeneous Information Network Embedding in Hyperbolic Spaces; facing Changes: Continual Entity Alignment for Growing Knowledge Graphs; Mapping Relational Database Constraints to SHACL; POSO: A Generic Positioning System Ontology; each Snapshot to Each Space: Space Adaptation for Temporal Knowledge Graph Completion; efficient Dependency Analysis for Rule-Based Ontologies; heterogeneous Graph Neural Network with Hypernetworks for Knowledge Graph Embedding; MultPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs; RT-KGD: Relation Transition Aware Knowledge-Grounded Dialogue Generation; Faithful Embeddings for EL++ Knowledge Bases; LoGNet: Local and Global Triple Embedding Network; an Analysis of Content Gaps Versus User Needs in the Wikidata Knowledge Graph; Repairing SHACL Constraint Violations Using Answer Set Programming; entity Type Prediction Leveraging Graph Walks and Entity Descriptions; Strabo 2: Distributed Management of Massive Geospatial RDF Datasets; Controlled Query Evaluation in OWL 2 QL: A “Longest Honeymoon” Approach; a Survey of Syntactic Modelling Structures in Biomedical Ontologies; HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs; GNNQ: A Neuro-Symbolic Approach to Query Answering over Incomplete Knowledge Graphs; Radar Station: Using KG Embeddings for Semantic Table Interpretation and Entity Disambiguation; enhancing Document-Level Relation Extraction by Entity Knowledge Injection; CRNet: Modeling Concurrent Events over Temporal Knowledge Graph; LODChain: Strengthen the Connectivity of Your RDF Dataset to the Rest LOD Cloud; WDV: A Broad Data Verbalisation Dataset Built from Wikidata; machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching; The DLCC Node Classification Benchmark for Analyzing Knowledge Graph Embeddings; μKG: A Library for Multi-source Knowledge Graph Embeddings and Applications; IMGT-KG: A Knowledge Graph for Immunogenetics; REBench: Microbenchmarking Framework for Relation Extraction Systems; WDBench: A Wikidata Graph Query Benchmark.

#940
21st International Conference on Service-Oriented Computing, ICSOC 2023

Lect. Notes Comput. Sci. 2023;14420 LNCS():

2023

Ref ID: 5037

The proceedings contain 48 papers. The special focus in this conference is on Service-Oriented Computing. The topics include: IDLGen: Automated Code Generation for Inter-parameter Dependencies in Web APIs; time-Aware Log Anomaly Detection Based on Growing Self-organizing Map; an Empirical Evaluation of the Energy and Performance Overhead of Monitoring Tools on Docker-Based Systems; chainsFormer: A Chain Latency-Aware Resource Provisioning Approach for Microservices Cluster; energy-Efficient and Communication-Aware Resource Allocation in Container-Based Cloud with Group Genetic Algorithm; engineering Self-adaptive Microservice Applications: An Experience Report; FUSE: Fault Diagnosis and Suppression with eBPF for Microservices; serviceSim: A Modelling and Simulation Toolkit of Microservice Systems in Cloud-Edge Environment; 2DPChain: Orchestrating Transactions in Order-Execute Blockchain to Exploit Intra-batch and Inter-batch Parallelism; deep Learning Model for Personalized Web Service Recommendations Using Attention Mechanism; a Dynamical Model for the Nonlinear Features of Value-Driven Service Ecosystem Evolution; a Middleware for Hybrid Blockchain Applications: Towards Fast, Affordable, and Accountable Integration; An AI Chatbot for Explaining Deep Reinforcement Learning Decisions of Service-Oriented Systems; BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM; dependency-Aware Resource Allocation for Serverless Functions at the Edge; distributing Quantum Computations, by Shots; energy-Efficient Task Offloading with Statistic QoS Constraint Through Multi-level Sleep Mode in Ultra-Dense Network; enhancing Blockchain Performance via On-chain and Off-chain Collaboration; deep Reinforcement Learning-Based Scheduling for Same Day Delivery with a Dynamic Number of Drones; designing Reconfigurable Intelligent Systems with Markov Blankets; exploiting Category Information in Sequential Recommendation; Niagara: Scheduling DNN Inference Services on Heterogeneous Edge Processors; plan, Generate and Match: Scientific Workflow Recommendation with Large Language Models; predicting Effect and Cost of Microservice System Evolution Using Graph Neural Network; qoS Prediction via Multi-scale Feature Fusion Based on Convolutional Neural Network.

#933
20th European, Mediterranean, and Middle Eastern Conference, EMCIS 2023

Lect. Notes Bus. Inf. Process. 2024;502 LNBIP():

2024

Ref ID: 4652

The proceedings contain 43 papers. The special focus in this conference is on European, Mediterranean, and Middle Eastern. The topics include: Web Mining for Estimating Regulatory Blockchain Readiness; reviewing the Role of Secret Sharing Schemes in Electronic Payment Protocols; Decentralization of DAOs: A Fundamental Analysis; Blockchain-Powered NFTs: A Paradigm Shift in Carbon Credit Transactions for Traceability, Transparency, and Accountability; a Blockchain Framework for Digital Asset Ownership and Transfer in Succession; perspectives of Merchants Regarding Bitcoin’s Role as a Currency and Its Utility as a Payment System; a Chatbot Generator for Improved Digital Governance; A Structured Analysis of Domain-Specific Linked Open Vocabularies (LOV): Indicators for Interoperability and Reusability; predicting Digital Winners and Losers in Economic Crises Using Artificial Intelligence and Open Government Data; chatbot Technology Assessment: 40 Cases from Greece; the Effects of Economic Crisis on the Digitalization of the Greek Social Security; design, Implementation, and Evaluation of a Food Price Monitoring Tool for Supporting Data Journalists; Smartphone Apps for Parents of Preterm Infants from NICU to Home: A Quality, Evidence-Based Content and Data Protection Assessment; assessing the Progress of Portuguese Hospitals’ Online Services; A Cross-Sector Data Space for Correlating Environmental Risks with Human Health; using Computational Knowledge Extraction Approach to Assess Three Decades of Health Management Information Systems for Informed Actions; the Role of Artificial Ethics Principles in Managing Knowledge and Enabling Data-Driven Decision Making in Supply Chain Management; fine-Tuning Large-Scale Project Scheduling; Integrating LLMs in Higher Education, Through Interactive Problem Solving and Tutoring: Algorithmic Approach and Use Cases.

#961
2024 14th International Conference on Pattern Recognition Systems, ICPRS 2024

2024 14th International Conference on Pattern Recognition Systems, ICPRS 2024 2024;():

Institute of Electrical and Electronics Engineers Inc. 2024

Ref ID: 4133

The proceedings contain 42 papers. The topics discussed include: LLM-aided knowledge graph construction for zero-shot visual object state classification; enhancing Apple’s defect classification: insights from visible spectrum and narrow spectral band imaging; analyzing emotional and topical patterns in conspiracy theory narratives: a discourse comparative study on the 2023 Hawaii wildfires; non-invasive estimation of moisture content in mushrooms using hyperspectral imaging and machine learning-based stacking regressor model; a concept drift based approach to evaluating model performance and theoretical lifespan; autism spectrum disorder prediction using machine learning classifiers; adversarial contrastive representation learning for passive Wi-Fi fingerprinting of individuals; and SAI-ChileanDiet: a multi-label food dataset with self-acquired images of the Chilean diet.

#896
1st Workshop on Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning, NeusymBridge 2024 at LREC-COLING 2024 - Workshop Proceedings

1st Workshop on Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning, NeusymBridge 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;():

European Language Resources Association (ELRA) 2024

Ref ID: 4617

The proceedings contain 5 papers. The topics discussed include: probing large language models from a human behavioral perspective; the semantic relations in LLMs: an information-theoretic compression approach; word sense disambiguation as a game of neurosymbolic darts; open event causality extraction by the assistance of LLM in task annotation, dataset, and method; and the need for grounding in LLM-based dialogue systems.

#895
1st Working conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow, AI Tomorrow 2023

Informatik aktuell 2024;():

Springer Science and Business Media Deutschland GmbH 2024

Ref ID: 4679

The proceedings contain 12 papers. The special focus in this conference is on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow. The topics include: AI-Powered Knowledge and Expertise Mining in Healthcare from a Field Experiment; iterative Development of a Process-Oriented Approach for the Selection of Platform-Based Digital Services; classification of Static Poses Based on Key Point Detection for Application of Incriminated Image Files; Human Centered Implementation Process of AI in SMEs – Conditions for Success; LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT; Foundations for the Development of an AI-based, Platformindipendent cOmpanion-app [for] Lifelong Learning-Optimization (APOLLO); viability of Knowledge Management Practices for a Successful Digital Transformation in Small- and Medium- Sized Enterprises; identification of Machine Learning Algorithms to Share Tacit Experimental Knowledge in Manual Production; An Application of AI for Online Estimation of the Impact of Imperfections in Additive Manufactured Components.

#894
1st International Workshop on Natural Scientific Language Processing and Research Knowledge Graphs, NSLP 2024

Lect. Notes Comput. Sci. 2024;14770 LNAI():

2024

Ref ID: 4500

The proceedings contain 21 papers. The special focus in this conference is on Natural Scientific Language Processing and Research Knowledge Graphs. The topics include: Towards a Novel Classification of Table Types in Scholarly Publications; OCR Cleaning of Scientific Texts with LLMs; RTaC: A Generalized Framework for Tooling; scientific Software Citation Intent Classification Using Large Language Models; repoFromPaper: An Approach to Extract Software Code Implementations from Scientific Publications; Automated Extraction of Research Software Installation Instructions from README Files: An Initial Analysis; a Technical/Scientific Document Management Platform; the Effect of Knowledge Graph Schema on Classifying Future Research Suggestions; assessing the Overlap of Science Knowledge Graphs: A Quantitative Analysis; FoRC@NSLP2024: Overview and Insights from the Field of Research Classification Shared Task; NRK at FoRC 2024 Subtask I: Exploiting BERT-Based Models for Multi-class Classification of Scholarly Papers; advancing Automatic Subject Indexing: Combining Weak Supervision with Extreme Multi-label Classification; single-Label Multi-modal Field of Research Classification; Enriched BERT Embeddings for Scholarly Publication Classification; SOMD@NSLP2024: Overview and Insights from the Software Mention Detection Shared Task; Software Mention Recognition with a Three-Stage Framework Based on BERTology Models at SOMD 2024; ABCD Team at SOMD 2024: Software Mention Detection in Scholarly Publications with Large Language Models; falcon 7b for Software Mention Detection in Scholarly Documents; enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models.

#931
19th International Conference on Wisdom, Well-Being, Win-Win, iConference 2024

Lect. Notes Comput. Sci. 2024;14597 LNCS():

2024

Ref ID: 4604

The proceedings contain 91 papers. The special focus in this conference is on Wisdom, Well-Being and Win-Win. The topics include: Identifying the Potential Users of Community Archives: A Case Study of the History of the Chinese 30 Years Project; What Motivates You to Use VR Exergames to Substitute for Real Sports?—An Empirical Study Based on Technology Readiness and Technology Acceptance Model; the Filtered Appeal: Evaluating the Impact of Appearance Enhancement on Effectiveness of Donation Requests; “If I Like BLANK, What Else Will I Like?”: Analyzing a Human Recommendation Community on Reddit; can Chatbot Anthropomorphism and Empathy Mitigate the Impact of Customer Anger on Satisfaction?; understanding Users’ Decision-Making on Privacy Disclosure from a Configurational Perspective Perceived Values, Privacy Concerns, Cognitive Style, and Trust; genre Recognition: A Model of Behaviour; “How I Form and Escape Information Cocoons”: An Interview Study of Users on Short Video Apps; are Older People Battling with Digital Financial Services?; plant-Based Predictions: An Exploratory Predictive Analysis of Purchasing Behavior of Meat-Alternatives by U.S. Consumers (2020); AIGC-Enabled Interdisciplinary Science Measurement; Role of Emotional Experience in AI Voice Assistant User Experience in Voice Shopping; a Contextualized Government Service Chatbot for Individuals with limited Information Literacy; Detection Vs. Anti-detection: Is Text Generated by AI Detectable?; privacyChat: Utilizing Large Language Model for Fine-Grained Information Extraction over Privacy Policies; reimagining Data Science Methodology for Community Well-Being Through Intersectional Feminist Voices; participatory Observation Methods Within Data-Intensive Science: Formal Evaluation and Sociotechnical Insight; from Knowledge Representation to Knowledge Organization and Back; understanding Researchers’ Data-Centric Tasks: A Classification of Goals, Gaps, and Resources.

#930
19th China National Conference on Computational Linguistics, CCL 2020

Lect. Notes Comput. Sci. 2020;12522 LNAI():

2020

Ref ID: 5771

The proceedings contain 34 papers. The special focus in this conference is on Computational Linguistics. The topics include: Chinese Named Entity Recognition via Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism; a Practice of Tourism Knowledge Graph Construction Based on Heterogeneous Information; a Novel Joint Framework for Multiple Chinese Events Extraction; entity Relative Position Representation Based Multi-head Selection for Joint Entity and Relation Extraction; a Mixed Learning Objective for Neural Machine Translation; multi-reward Based Reinforcement Learning for Neural Machine Translation; low-Resource Text Classification via Cross-Lingual Language Model Fine-Tuning; constructing Uyghur Named Entity Recognition System Using Neural Machine Translation Tag Projection; recognition Method of Important Words in Korean Text Based on Reinforcement Learning; semantic-Aware Chinese Zero Pronoun Resolution with Pre-trained Semantic Dependency Parser; mongolian Questions Classification Based on Multi-Head Attention; the Annotation Scheme of English-Chinese Clause Alignment Corpus; categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explanation Tool; LiveQA: A Question Answering Dataset Over Sports Live; Chinese and English Elementary Discourse Units Recognition Based on Bi-LSTM-CRF Model; better Queries for Aspect-Category Sentiment Classification; multimodal Sentiment Analysis with Multi-perspective Fusion Network Focusing on Sense Attentive Language; CAN-GRU: A Hierarchical Model for Emotion Recognition in Dialogue; a Joint Model for Aspect-Category Sentiment Analysis with Shared Sentiment Prediction Layer; compress Polyphone Pronunciation Prediction Model with Shared Labels; improving Sentence Classification by Multilingual Data Augmentation and Consensus Learning; multi-task Legal Judgement Prediction Combining a Subtask of the Seriousness of Charges; clickbait Detection with Style-Aware Title Modeling and Co-attention.

#924
17th International Conference on Knowledge Science, Engineering and Management, KSEM 2024

Lect. Notes Comput. Sci. 2024;14887 LNAI():

2024

Ref ID: 4106

The proceedings contain 160 papers. The special focus in this conference is on Knowledge Science, Engineering and Management. The topics include: EE-LCE: An Event Extraction Framework Based on LLM-Generated CoT Explanation; attention and Learning Features-Enhanced Knowledge Tracing; An MLM Decoding Space Enhancement for Legal Document Proofreading; meta-pruning: Learning to Prune on Few-Shot Learning; knowledge-Informed Molecular Learning: A Survey on Paradigm Transfer; GenFlowchart: Parsing and Understanding Flowchart Using Generative AI; DSCVSR: A Lightweight Video Super-Resolution for Arbitrary Magnification; programming Knowledge Tracing with Context and Structure Integration; a Knowledge-Based Semi-supervised Active Learning Method for Precision Pest Disease Diagnostic; multi-label Feature Selection with Adaptive Subspace Learning; User Story Classification with Machine Learning and LLMs; PTMA: Pre-trained Model Adaptation for Transfer Learning; optimization Strategies for Knowledge Graph Based Distractor Generation; reinforced Subject-Aware Graph Neural Network for Related Work Generation; EFCC-IeT: Cross-Modal Electronic File Content Correlation via Image-Enhanced Text; multi-relation Neural Network Recommendation Model Based on Knowledge Graph Embedding Algorithm; link Prediction Based on Deep Global Information in Heterogeneous Graph; subject Knowledge Entity Relationship Extraction Based on Multi-feature Fusion and Relation Specific Horns Tagging; a Human-Computer Negotiation Model Based on Q-Learning; affine Transformation-Based Knowledge Graph Embedding; integrating Prior Scenario Knowledge for Composition Review Generation; distant Supervised Relation Extraction on Pre-train Model with Improved Multi-label Attention Mechanism; sEMG-Based Multi-view Feature-Constrained Representation Learning; vicinal Data Augmentation for Classification Model via Feature Weaken; STM: An Improved Peak Price Tracking-Based Online Portfolio Selection Algorithm; spatiotemporal Dependence Learning with Meteorological Context for Transportation Demand Prediction; automatic Meter Pointer Reading Based on Knowledge Distillation; multi-table Question Answering Method Based on Correlation Evaluation and Precomputed Cube; a Joint Multi-task Learning Model for Web Table-to-Knowledge Graph Matching; an In-Context Schema Understanding Method for Knowledge Base Question Answering.

#917
13th International Conference on Knowledge Science, Engineering and Management, KSEM 2020

Lect. Notes Comput. Sci. 2020;12274 LNAI():

2020

Ref ID: 5719

The proceedings contain 85 papers. The special focus in this conference is on Knowledge Science, Engineering and Management. The topics include: A dynamic answering path based fusion model for KGQA; Improving deep item-based collaborative filtering with Bayesian personalized ranking for MOOC course recommendation; online programming education modeling and knowledge tracing; enhancing pre-trained language models by self-supervised learning for story cloze test; MOOCRec: An attention meta-path based model for Top-K recommendation in MOOC; PVFNet: Point-view fusion network for 3D shape recognition; HEAM: Heterogeneous network embedding with automatic meta-path construction; a graph attentive network model for P2P lending fraud detection; an empirical study on recent graph database systems; graph embedding based on characteristic of rooted subgraph structure; bibliometric analysis of twitter knowledge management publications related to health promotion; automatic cerebral artery system labeling using registration and key points tracking; page-level handwritten word spotting via discriminative feature learning; NADSR: A network anomaly detection scheme based on representation; a knowledge-based scheduling method for multi-satellite range system; IM-net: Semantic segmentation algorithm for medical images based on mutual information maximization; fast backward iterative laplacian score for unsupervised feature selection; improving low-resource chinese event detection with multi-task learning; feature selection using sparse twin support vector machine with correntropy-induced loss; customized decision tree for fast multi-resolution chart patterns classification; knowledge graphs meet geometry for semi-supervised monocular depth estimation; predicting user influence in the propagation of toxic information; extracting distinctive shapelets with random selection for early classification; butterfly-based higher-order clustering on bipartite networks; preface.

#916
13th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2021

Lect. Notes Comput. Sci. 2021;12672 LNAI():

2021

Ref ID: 5630

The proceedings contain 67 papers. The special focus in this conference is on Intelligent Information and Database Systems. The topics include: Entropy-Based Variational Learning of Finite Generalized Inverted Dirichlet Mixture Model; Mixture-Based Unsupervised Learning for Positively Correlated Count Data; Phase Prediction of Multi-principal Element Alloys Using Support Vector Machine and Bayesian Optimization; VEGAS: A Variable Length-Based Genetic Algorithm for Ensemble Selection in Deep Ensemble Learning; Demand Forecasting for Textile Products Using Statistical Analysis and Machine Learning Algorithms; Parallelization of Reinforcement Learning Algorithms for Video Games; A Gap–Based Memetic Differential Evolution (GaMeDE) Applied to Multi–modal Optimisation – Using Multi–objective Optimization Concepts; Simulating Emergency Departments Using Generalized Petri Nets; What Do You Know About Your Network: An Empirical Study of Value Network Awareness in E-commerce; Investigating Crossover Operators in Genetic Algorithms for High-Utility Itemset Mining; A Robust Approach to Employee Competences in Project Management; Key Aspects of Customer Intelligence in the Era of Massive Data; How Spatial Data Analysis Can Make Smart Lighting Smarter; Convolutional Neural Networks for Web Documents Classification; Sequential Model-Based Optimization for Natural Language Processing Data Pipeline Selection and Optimization; Automatic Cyberbullying Detection on Twitter Using Bullying Expression Dictionary; Development of Morphological Segmentation for the Kyrgyz Language on Complete Set of Endings; Empirical Study of Tweets Topic Classification Using Transformer-Based Language Models; A New Approach for Measuring the Influence of Users on Twitter; Building a Domain-Specific Knowledge Graph for Business Networking Analysis; Complexes of Low Dimensional Linear Classifiers with L1 Margins; N-Tier Machine Learning-Based Architecture for DDoS Attack Detection.

#910
10th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2021

Lect. Notes Comput. Sci. 2021;13028 LNAI():

2021

Ref ID: 5650

The proceedings contain 116 papers. The special focus in this conference is on Natural Language Processing and Chinese Computing. The topics include: Adaptive Transformer for Multilingual Neural Machine Translation; Improving Non-autoregressive Machine Translation with Soft-Masking; AutoNLU: Architecture Search for Sentence and Cross-sentence Attention Modeling with Re-designed Search Space; AutoTrans: Automating Transformer Design via Reinforced Architecture Search; A Word-Level Method for Generating Adversarial Examples Using Whole-Sentence Information; RAST: A Reward Augmented Model for Fine-Grained Sentiment Transfer; Pre-trained Language Models for Tagalog with Multi-source Data; Accelerating Pretrained Language Model Inference Using Weighted Ensemble Self-distillation; Employing Sentence Compression to Improve Event Coreference Resolution; Chinese Macro Discourse Parsing on Dependency Graph Convolutional Network; BRCEA: Bootstrapping Relation-Aware Cross-Lingual Entity Alignment; Employing Multi-granularity Features to Extract Entity Relation in Dialogue; Attention Based Reinforcement Learning with Reward Shaping for Knowledge Graph Reasoning; Entity-Aware Relation Representation Learning for Open Relation Extraction; ReMERT: Relational Memory-Based Extraction for Relational Triples; Recognition of Nested Entity with Dependency Information; HAIN: Hierarchical Aggregation and Inference Network for Document-Level Relation Extraction; Incorporate Lexicon into Self-training: A Distantly Supervised Chinese Medical NER; Diversified Paraphrase Generation with Commonsense Knowledge Graph; Explore Coarse-Grained Structures for Syntactically Controllable Paraphrase Generation; Predicting Categorial Sememe for English-Chinese Word Pairs via Representations in Explainable Sememe Space; Chinese Poetry Generation with Metrical Constraints; CNewSum: A Large-Scale Summarization Dataset with Human-Annotated Adequacy and Deducibility Level; Question Generation from Code Snippets and Programming Error Messages; Extractive Summarization of Chinese Judgment Documents via Sentence Embedding and Memory Network; ThinkTwice: A Two-Stage Method for Long-Text Machine Reading Comprehension.
